Reasons I use EXISTS() and never LEFT JOIN... IS [NOT] NULL:
Consistency: if I reference a table in my FROM clause then there will always be at least one column from that table in the SELECT clause (or in theory GROUP BY, though I include that only really for completeness). Always. SELECT...FROM....: that’s what my queries do
Ease of reading (follows on from above): EXISTS is simple, obvious, self contained. Even without knowing SQL you can have a good guess at what it does. LEFT JOIN... IS [NOT] NULL means you have to cross reference the FROM and WHERE clauses, possibly parsing out a load of other stuff extraneous to that particular operation. You effectively have to re-evalate the FROM clause based on the contents of WHERE
Reversibility: Opposite of EXISTS? NOT EXISTS. Opposite of LEFT JOIN... IS NULL? Change your LEFT join to an INNER JOIN and remove the NULL check in the WHERE clause. Depending on the cardinality of the relationship you may need to throw a DISTINCT in there
Efficiency: In very simple queries they can be the same. Any more than simple though and the optimiser uses a [ANTI] SEMI-JOIN for EXISTS which is very efficient. If you have to use DISTINCT for your LEFT JOIN... IS NOT NULL then this is even more true. Only once have I seen an EXISTS elicit a worse plan than LEFT JOIN... IS NULL and that was some years ago on SQL 2k
A sly comment on the reversibility; change the WHERE to an IS NOT NULL
You are not that sloppy a coder George.
Yes you are correct - it will have to scan but it will stop at the first match. If the first match is the first row then it is MILES more efficient. If the first match is the last row then it is no more efficient.
Don't worry, I'd never do that, but I just wanted to point out the possibilities
Interestingly enough I just wrote an EXISTS query about 2 minutes ago for a colleague and the syntax is easier than I remember - I don't think I have to learn anything new other than working out a pretty (and consistant) way to tab out my [NOT] EXISTS clauses
Uh, thanks for the super detailed explanation of why I should not use the "LEFT JOIN... IS [NOT] NULL" method. I should point out that I dont know what that method is. But if i ever see it, I hope I'll remember to steer clear of it.
Just to wrap this thread up, can I please have a simple example of how to use NOT EXISTS to solve my original problem?
select * from Products p
where not exists (select 1 from orders o
where p.ProductID = o.ProductID1
and p.ProductID2 = oProductID2)
and for your first query:
select * from Products p
where not exists (select 1 from Orders o
where p.ProductID = o.ProductID )
Both of these will, also, be magnitudes faster if you have an index on ProductID and ProductID2 in the orders table. As now you would have something to match on the index with the where clause, whereas before you had no where clause and were searching every row of the table.