Results 1 to 7 of 7
  1. #1
    Join Date
    Jul 2005
    Location
    MD
    Posts
    6

    Unanswered: INNER JOIN on on multiple columns

    I'm trying to eliminate all records that do not have one of two conditions. I'm using INNER JOIN on a derived "table", not a table in my database. The code below summarizes what I'm trying to do. Please note that this is an extremely simplified query.

    ---------------------------

    SELECT * FROM jobs
    INNER JOIN
    (
    SELECT contact_id FROM contacts WHERE deleted = 0
    )AS ValidContacts
    ON (jobs.owner = ValidContacts.contact_id OR jobs.assignee = ValidContacts.contact_id)

    ---------------------------

    This works fine when the the "SELECT contact_id FROM contacts WHERE deleted = 0" part returns a small number of records, however when that part returns a very large number of records, the query hangs and never completes. If I remove one of the conditions for the JOIN, it works fine, but I need both. Why doesn't this work?

    Another possible solution is if I were to use "WHERE/IN" like this:

    ---------------------------

    SELECT * FROM jobs
    WHERE owner IN (SELECT contact_id FROM contacts WHERE deleted = 0)
    OR assignee IN (SELECT contact_id FROM contacts WHERE deleted = 0)

    ---------------------------

    This would work fine, but I don't want to have to run the "SELECT contact_id FROM contacts WHERE deleted = 0" part twice (since in my real code, it is much more complicated and performance is a big issue". Any help would be greatly appreceated.

    I'm using SQL Server 2000 on Windows XP Pro.

  2. #2
    Join Date
    Feb 2003
    Location
    India
    Posts
    216
    could not understand why are you using subquery which always performs poor. a straight join should give you the same result if there are no other complications, not mentioned here.

    you can rearrange the query like this

    SELECT jobs.* FROM jobs
    INNER JOIN contacts
    ON (jobs.owner = Contacts.contact_id OR jobs.assignee = Contacts.contact_id)
    where contacts.deleted = 0

  3. #3
    Join Date
    Jul 2005
    Location
    MD
    Posts
    6
    The subquery in my example is much simpler than the one in my real script. It gets values from multiple tables and compiles them. But, even the INNER JOIN in your example doesn't work on large databases. Try running your script on two large tables (my "jobs" table is 20224 records long and my "contacts" table is 3946 records long), it just hanges there and takes forever to execute. It has something to do with the OR oporator, because using my second solution (with the WHERE/IN clauses) executed quickly.

    Thanks for your reply though.

  4. #4
    Join Date
    Aug 2001
    Posts
    24

    Inner Join

    SELECT * FROM jobs
    INNER JOIN
    (
    SELECT contact_id FROM contacts WHERE deleted = 0
    ) AS ValidContacts1
    ON (jobs.owner = ValidContacts1.contact_id)
    INNER JOIN
    (
    SELECT contact_id FROM contacts WHERE deleted = 0
    ) AS ValidContacts2
    ON (jobs.assignee = ValidContacts2.contact_id)

  5. #5
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Provided Answers: 1
    Quote Originally Posted by HighRise
    Try running your script on two large tables (my "jobs" table is 20224 records long and my "contacts" table is 3946 records long), it just hanges there and takes forever to execute.
    By no stretch of the imagination would these be considered large tables for SQL Server. So something else is going on...
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

  6. #6
    Join Date
    Jul 2005
    Location
    MD
    Posts
    6
    Quote Originally Posted by hkamatgi
    SELECT * FROM jobs
    INNER JOIN
    (
    SELECT contact_id FROM contacts WHERE deleted = 0
    ) AS ValidContacts1
    ON (jobs.owner = ValidContacts1.contact_id)
    INNER JOIN
    (
    SELECT contact_id FROM contacts WHERE deleted = 0
    ) AS ValidContacts2
    ON (jobs.assignee = ValidContacts2.contact_id)
    This is really no different from my alternate script at the beginning of this post (the one using WHERE/IN). It may work but I don't want to use it because I only want to have to run the inner query once since the one I'm actually using takes almost a second to execute. Thanks anyway.

  7. #7
    Join Date
    Jul 2005
    Location
    MD
    Posts
    6
    Quote Originally Posted by blindman
    By no stretch of the imagination would these be considered large tables for SQL Server. So something else is going on...
    So, are you saying that using something like:

    SELECT jobs.* FROM jobs
    INNER JOIN contacts
    ON (jobs.owner = Contacts.contact_id OR jobs.assignee = Contacts.contact_id)
    where contacts.deleted = 0

    should work? I've search the web and haven't found any uses of "OR" in an INNER JOIN. What else could it be if not the OR (since using AND works...)?

    Another idea I've had is to use a temporary table. Something like:

    Code:
    SELECT contact_id
    INTO #Temp
    FROM contacts
    WHERE deleted = 0
    
    SELECT * FROM jobs
    WHERE owner IN (SELECT contact_id FROM #Temp)
    OR assignee IN (SELECT contact_id FROM #Temp)
    
    DROP TABLE #Temp
    Now, obviously this looks extremely unnecessary given the simplicity of my contacts query, but since the actual query takes almost a second, there is a noticeable performance improvement when I use this method. However, I have no idea what the dangers are. For instance, what if two user happen to run this at the same time? Would the two temporary tables with the same name (#Temp) conflict with each other?

    Anyway, thanks for your suggestions so far.
    Last edited by HighRise; 01-12-06 at 18:55.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •