Page 1 of 2 12 LastLast
Results 1 to 15 of 18
  1. #1
    Join Date
    Feb 2010
    Posts
    75

    Unanswered: Where Not In (multiple columns)

    I frequently do queries like this:

    Code:
    select * from Products
    where ProductID not in (
       select ProductID from Orders)
    For example if I wanted all the products that have never been ordered.


    Now I want to do something similar, but the primary key on my table is a multi-column key. Whats the syntax to do a query like that? For example:

    (this doesnt work)
    Code:
    select * from Products
    where (ProductID, ProductID2) not in (
       select (ProductID1, ProductID2) from Orders)

  2. #2
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    Use EXISTS()
    Testimonial:
    pootle flump
    ur codings are working excelent.

  3. #3
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    Quote Originally Posted by pootle flump View Post
    Use EXISTS()
    an even better hint would be to use NOT EXISTS

    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  4. #4
    Join Date
    Jan 2007
    Location
    UK
    Posts
    11,434
    Provided Answers: 10
    I need to get in the habit of using EXISTS - I still use the LEFT JOIN test for NULLs method...
    George
    Home | Blog

  5. #5
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    I really dislike that method. I am happy to list all the reasons why but I really think it sucks.
    Testimonial:
    pootle flump
    ur codings are working excelent.

  6. #6
    Join Date
    Jan 2007
    Location
    UK
    Posts
    11,434
    Provided Answers: 10
    Quote Originally Posted by pootle flump View Post
    I am happy to list all the reasons why but I really think it sucks.
    You shouldn't have offered
    Teach me o flumpish one
    George
    Home | Blog

  7. #7
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    I knew you wouldn't be able to resist the wise teachings of a sage and venerated flump.
    Tomorrow padwan - I'm off to rugby training
    Testimonial:
    pootle flump
    ur codings are working excelent.

  8. #8
    Join Date
    Dec 2007
    Location
    Richmond, VA
    Posts
    1,328
    Provided Answers: 5
    One of the main reasons is performance. The left outer join has to look at every row that qualifies on the join. The exists/not exists, just finds the first one and stops.
    Dave

  9. #9
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    Reasons I use EXISTS() and never LEFT JOIN... IS [NOT] NULL:

    1. Consistency: if I reference a table in my FROM clause then there will always be at least one column from that table in the SELECT clause (or in theory GROUP BY, though I include that only really for completeness). Always. SELECT...FROM....: that’s what my queries do
    2. Ease of reading (follows on from above): EXISTS is simple, obvious, self contained. Even without knowing SQL you can have a good guess at what it does. LEFT JOIN... IS [NOT] NULL means you have to cross reference the FROM and WHERE clauses, possibly parsing out a load of other stuff extraneous to that particular operation. You effectively have to re-evalate the FROM clause based on the contents of WHERE
    3. Reversibility: Opposite of EXISTS? NOT EXISTS. Opposite of LEFT JOIN... IS NULL? Change your LEFT join to an INNER JOIN and remove the NULL check in the WHERE clause. Depending on the cardinality of the relationship you may need to throw a DISTINCT in there
    4. Efficiency: In very simple queries they can be the same. Any more than simple though and the optimiser uses a [ANTI] SEMI-JOIN for EXISTS which is very efficient. If you have to use DISTINCT for your LEFT JOIN... IS NOT NULL then this is even more true. Only once have I seen an EXISTS elicit a worse plan than LEFT JOIN... IS NULL and that was some years ago on SQL 2k
    Testimonial:
    pootle flump
    ur codings are working excelent.

  10. #10
    Join Date
    Jan 2007
    Location
    UK
    Posts
    11,434
    Provided Answers: 10
    Sounds like a case for me to brush up on my EXISTS syntax!

    A sly comment on the reversibility; change the WHERE to an IS NOT NULL But I do concede the point entirely.

    Picking up on dav1mo's explanation, specifically this bit:
    The exists/not exists, just finds the first one and stops.
    In a NOT EXISTS scenario on a non-indexed column, won't the optimizer have to scan all the rows to find a row that doesn't exist?

    I appreciate that the same is true fo a non-indexed join, but I just want to make sure I've got my head round this one
    George
    Home | Blog

  11. #11
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    Quote Originally Posted by gvee View Post
    A sly comment on the reversibility; change the WHERE to an IS NOT NULL
    You are not that sloppy a coder George.

    Yes you are correct - it will have to scan but it will stop at the first match. If the first match is the first row then it is MILES more efficient. If the first match is the last row then it is no more efficient.

    The physical operation Dave talks about is known as a SEMI-JOIN. Craig Freedman's blog is awesome, and his explanation of all the JOINs is fantastic reading. This is an intro (he does an article on each) but it mentions SEMI-JOINs:
    Introduction to Joins - Craig Freedman's SQL Server Blog - Site Home - MSDN Blogs

    EDIT - that article is about logical JOINs. When I mentioned he has great articles on the various joins I meant the physical joins (HASH, MERGE etc.) not the ones in the article.
    Last edited by pootle flump; 07-23-10 at 06:48.
    Testimonial:
    pootle flump
    ur codings are working excelent.

  12. #12
    Join Date
    Jan 2007
    Location
    UK
    Posts
    11,434
    Provided Answers: 10
    Quote Originally Posted by pootle flump
    You are not that sloppy a coder George.
    Don't worry, I'd never do that, but I just wanted to point out the possibilities

    Interestingly enough I just wrote an EXISTS query about 2 minutes ago for a colleague and the syntax is easier than I remember - I don't think I have to learn anything new other than working out a pretty (and consistant) way to tab out my [NOT] EXISTS clauses
    George
    Home | Blog

  13. #13
    Join Date
    Jan 2007
    Location
    UK
    Posts
    11,434
    Provided Answers: 10
    Apologies for hijacking your thread a littel BeerOClock... I take it you've got the answer you needed from these posts? If not let us know
    George
    Home | Blog

  14. #14
    Join Date
    Feb 2010
    Posts
    75
    Uh, thanks for the super detailed explanation of why I should not use the "LEFT JOIN... IS [NOT] NULL" method. I should point out that I dont know what that method is. But if i ever see it, I hope I'll remember to steer clear of it.

    Just to wrap this thread up, can I please have a simple example of how to use NOT EXISTS to solve my original problem?

    Thanks!

  15. #15
    Join Date
    Dec 2007
    Location
    Richmond, VA
    Posts
    1,328
    Provided Answers: 5
    Try:
    Code:
    select * from Products p
    where not exists (select 1 from orders o
                            where p.ProductID   = o.ProductID1
                                and p.ProductID2 = oProductID2)
    and for your first query:

    Code:
    select * from Products p
    where not exists (select 1 from Orders o
                            where p.ProductID = o.ProductID )
    Both of these will, also, be magnitudes faster if you have an index on ProductID and ProductID2 in the orders table. As now you would have something to match on the index with the where clause, whereas before you had no where clause and were searching every row of the table.
    Dave Nance

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •