1. Registered User
Join Date
Jun 2004
Posts
127

## JOINS vs SUBQUERIES

Hi guys, just a quick Q in order to quash some confusion I have.

joins are expensive (especially outer & self joins). as a rule, are subqueries expensive too? - i believe correlated subq's can be.

for example, if each of emp & dept tables are physically stored on same disk, is statement 1 more costly than statement 2? if so why?

reason for my Q, is to determine if I am best to formulate queries prefering to use subq's over joins whereever possible.

Code:
```Select ename, job, sal
from emp join dept using (deptno)
where loc = 'NEW YORK';

Select ename, job, sal
from emp
where deptno = (select deptno from dept where loc = 'NEW YORK');```

2. World Class Flame Warrior
Join Date
Jun 2003
Location
Ohio
Posts
12,595
Who says joins are expensive?
Compared to what?
Are cars expensive? They cost a lot of money, but provide a lot of value.
Which of your two methods is more efficient is probably most dependent upon the database server you are using. I suggest you post your question in one of the server-specific forums.

3. Registered User
Join Date
Oct 2002
Location
Posts
697
Originally Posted by FAC51
joins are expensive (especially outer & self joins). as a rule, are subqueries expensive too? - i believe correlated subq's can be.
Take a subquery of the form:

Code:
```SELECT *
FROM X
WHERE a IN (SELECT a FROM P)```
We'll say that P is either another table or a view.

All we're really saying is:

Code:
```SELECT X.*
FROM X INNER JOIN P ON X.a = P.a```
So a subquery is equivalent to a join. (It's trickier if you use "not in" or if the restrict expression is more complex, but that can all be reduced to joins as well. At least, I've done the proofs in proper relational algebra... SQL might make things much harder.) And if I were writing an optimizer, I'd have it rewrite my queries as much as possible to give me a simple sequence of joins.

The reason being that the biggest problem with joins is ordering them correctly and making the best use of indexes. But because a subquery is translated to a join by the optimizer, it generally shouldn't be any faster or slower.

4. Registered User
Join Date
Oct 2002
Location
Posts
697
Originally Posted by FAC51
Code:
```Select ename, job, sal
from emp join dept using (deptno)
where loc = 'NEW YORK';

Select ename, job, sal
from emp
where deptno = (select deptno from dept where loc = 'NEW YORK');```
Oh, yeah, in principle a subquery using "=" is the same as one using "IN" except that I'm not sure what SQL is supposed to do if your subquery returns two rows. I also think it violates common sense to say that a table is equal to a number, but that's SQL for you.

Anyhw, your query is roughly the same as saying:

Code:
```SELECT emp.ename, emp.job, emp.sal
FROM emp INNER JOIN dept ON emp.deptno = dept.deptno
WHERE dept.loc = 'New York'```
Except that I know what would happen if there were two entries in dept for a given deptno. (You might have to put DISTINCT in there...) I'm guessing that deptno is a candidate key (meaning it's either a primary key or a UNIQUE constraint) and that that's not an issue.

There's also "EXISTS" type subqueries. These are SQL's inheritance from the relational calculus.

Code:
```SELECT * FROM X
WHERE EXISTS (SELECT * FROM P WHERE P.a = X.a)```
We can still get that behavior like so:

Code:
```SELECT X.*
FROM X INNER JOIN P ON X.a = P.a```

5. Registered User
Join Date
Jun 2004
Posts
127
thanks for your input sco08y, things are much clearer now.

6. Village Idiot
Join Date
Jul 2003
Location
Michigan
Posts
1,941
Originally Posted by FAC51
thanks for your input sco08y, things are much clearer now.
heh

with each answer comes more questions

#### Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts
•