Results 1 to 2 of 2
  1. #1
    Join Date
    Sep 2002
    Location
    Montreal, Canada
    Posts
    224
    Provided Answers: 1

    Unanswered: Extracting main domains

    Hi folks!

    I have a table of domain name from which I want to extract the main domains. I am using MySQL 3.23.58 so subselects can't be used.

    For instance:

    I have a topleveldomains table
    1 qc.ca
    2 on.ca
    3 ca
    4 com
    5 gouv.ca

    I have a domains table
    1 toto.qc.ca
    2 toto.ca
    3 xyz.com
    4 mysql.qc.ca
    5 city.toronto.on.ca

    I would like to get:
    toto
    xyz
    mysql
    city.toronto

    My problem is that some top level domains are "substring" of others... For instance, ca is a substring of qc.ca so in that case, my join matches
    the following:

    mysql.qc.ca ---> ca
    mysql.qc.ca ---> qc.ca

    So I end up thinking that I found 2 main domains: mysql.qc and mysql. Also note that a main domain can also contain a period (city.toronto).

    Any idea?

    Thanks in advance!

  2. #2
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    the problem lies in your topleveldomains table

    rows 1 and 3 are the cause

    perhaps what you need is to run two queries to do the matching -- the first will check for any 2-part top level (e.g. qc.ca), while the second will check for any 1-part top level that weren't already matched by the first query

    without subqueries, you'll need a UNION and for the second query you'll need a LEFT OUTER JOIN

    however, there's still a problem

    you say you want "toto" as the result, which leads me to believe that you don't care that two completely different sites were matched against two completely different topleveldomains if you only want to see "toto" once in the results

    therefore you can use UNION ALL and skip the LEFT OUTER JOIN in the second query

    does that make sense?
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •