| |
|
If this is your first visit, be sure to check out the FAQ by clicking the link above.
You may have to register before you can post: click the register link above to proceed.
To start viewing messages, select the forum that you want to visit from the selection below.
|
 |

11-07-07, 10:56
|
|
Registered User
|
|
Join Date: Oct 2004
Location: Paris, FRANCE
Posts: 132
|
|
JOIN vs Agregations
|
|
Hi,
Maybe the topic have already been touched
in that case don't hesitate to give me a link on
the corresponding thread.
I don't know the history concerning the managment
of joined tables through JOIN conditions or WHERE
statements.
But I was wondering if ANSI plans to manage aggregations
with JOIN syntaxes (like subqueries when we deal with
effectives dates for example) or if the future is given to
the subqueries
So do we have to keep using the JOIN syntax for his
lisibility or finally use only subqueries or both ?
|
|

11-07-07, 11:08
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
|
|
i'm not sure i understand your question, but i think the answer is "both"
it would be best if you could give an actual example of the sql
|
|

11-10-07, 23:59
|
|
Registered User
|
|
Join Date: Sep 2003
Location: The extremely Royal borough of Kensington, London
Posts: 778
|
|
|
|
There is a problem with your question, which is you are comparing two completely different concepts. It is therefore not possible to give you an answer specifically addressing your question, however, I can comment on the use of aggregate functions and their relationship with the Cartesian product of two sets.
The Cartesian product of two sets is not required in order to use an aggregate function. Conversely, one is not required to use aggregate functions in producing the Cartesian product of two sets.
__________________
Bessie Braddock: Winston, you are drunk!
Churchill: And Madam, you are ugly. And tomorrow, I'll be sober, and you will still be ugly.
|
|

11-30-07, 08:49
|
|
Registered User
|
|
Join Date: Oct 2004
Location: Paris, FRANCE
Posts: 132
|
|
Sorry I have badly presented my question
Let me give you an example
Let's imagine those two tables
Code:
+------------+
| employee |
+------------+
| empl_id |
| first_name |
| last_name |
| company_id |
+------------+
+-----------+
| empl_job |
+-----------+
| empl_id |
| eff_date |
| job_title |
+-----------+
Let's imagine we'd like to get the current job title
for each employees of the company xxx
Is it better to do
Code:
SELECT
a.first_name AS "FirstName"
, a.last_name AS "LastName"
, b.job_title AS "JobTitle"
FROM
employee AS a
JOIN empl_job AS b ON (
a.empl_id = b.empl_id
b.eff_date = (
SELECT MAX(eff_date)
FROM empl_job AS c
WHERE b.job_id = c.job_id
AND a.empl_id = c.empl_id
AND eff_date <= SYSDATE
)
)
WHERE
a.company_id = xxx
or
Code:
SELECT
a.first_name AS "FirstName"
, a.last_name AS "LastName"
, b.job_title AS "JobTitle"
FROM
employee AS a
JOIN empl_job AS b ON (a.empl_id = b.empl_id)
WHERE a.company_id = xxx
AND b.eff_date = (
SELECT MAX(eff_date)
FROM empl_job AS c
WHERE b.job_id = c.job_id
AND a.empl_id = c.empl_id
AND eff_date <= SYSDATE
)
)
or
Code:
SELECT
a.first_name AS "FirstName"
, a.last_name AS "LastName"
, b.job_title AS "JobTitle"
FROM
employee AS a
, empl_job AS b
WHERE a.company_id = xxx
AND a.empl_id = b.empl_id
AND b.eff_date = (
SELECT MAX(eff_date)
FROM empl_job AS c
WHERE b.job_id = c.job_id
AND a.empl_id = c.empl_id
AND eff_date <= SYSDATE
)
)
?
|
|

11-30-07, 08:59
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
|
|
so it appears that your question is "which approach is better?"
the answer is: test them and see, each dbms is different
by the way i can think of a few more variations of SQL for your problem
|
|

11-30-07, 09:13
|
|
Registered User
|
|
Join Date: Oct 2004
Location: Paris, FRANCE
Posts: 132
|
|
With your experience,
What syntax do you considere the most maintainable ?
Which is the most portable ? and/or the most performant ?
And if I have to specify a DB, let's choose Oracle
|
|

11-30-07, 09:39
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
|
|
they are all equally maintainable
the most portable one is the one which does not use proprietary SQL like SYSDATE
performant? you will have to test them on your particular dbms
|
|

11-30-07, 12:36
|
|
Registered User
|
|
Join Date: Jan 2007
Location: Jena, Germany
Posts: 2,662
|
|
The one that is easiest to maintain is the one that is easiest to understand by someone else. So maybe a few comments may help to explain what a subquery is doing or why certain constructs are used in specific places.
__________________
Knut Stolze
IBM DB2 Analytics Accelerator
IBM Germany Research & Development
|
|

02-21-08, 05:19
|
|
Registered User
|
|
Join Date: Mar 2007
Location: 636f6d7075746572
Posts: 770
|
|
What about :
Code:
SELECT
a.first_name AS "FirstName"
, a.last_name AS "LastName"
, b.job_title AS "JobTitle"
FROM
employee AS a
JOIN empl_job AS b ON (a.empl_id = b.empl_id)
JOIN (
SELECT empl_id, MAX(eff_date) as latest
FROM empl_job AS c
GROUP BY empl_id
) as x ON x.empl_id = a.empl_id AND x.latest = b.eff_date
)
WHERE a.company_id = xxx
note: I think the above works the same as the others (i've not tested it).
|
|

02-21-08, 05:58
|
|
Registered User
|
|
Join Date: Oct 2004
Location: Paris, FRANCE
Posts: 132
|
|
Indeed this one seems good to maintain
Generally I prefere always use JOINs because for example
if you want to exclude a table you simply need to comment
the concerned JOIN part instead of having to seach and
comment all the conditions in the WHERE.
|
|

02-24-08, 07:38
|
|
Registered User
|
|
Join Date: Sep 2004
Location: Belgium
Posts: 1,079
|
|
Speaking about readability & maintainability: I prefer using "temporary views", also known as "common table expressions" (or CTEs):
Code:
WITH
a(empl_id, "FirstName", "LastName") AS
( SELECT empl_id, first_name, last_name FROM employee
WHERE company_id = xxx
)
,
b(empl_id, eff_date, "JobTitle") AS
( SELECT empl_id, eff_date, job_title FROM empl_job )
,
x(empl_id, latest) AS
(
SELECT empl_id, MAX(eff_date)
FROM empl_job
GROUP BY empl_id
)
SELECT a."FirstName", a."LastName", b."JobTitle"
FROM
a INNER JOIN b ON a.empl_id = b.empl_id
INNER JOIN x ON x.empl_id = a.empl_id AND x.latest = b.eff_date
)
Performance should ideally be left to the DBMS' optimizer: no matter which formulation is used, the implementation (& performance) will be optimal.
(Reality is still different, of course, but DBMSs are evolving towards the ideal world ...)
__________________
--_Peter Vanroose,
__IBM Certified Database Administrator, DB2 9 for z/OS
__IBM Certified Application Developer
__ABIS Training and Consulting
__http://www.abis.be/
|
|

02-24-08, 12:39
|
|
Registered User
|
|
Join Date: Jan 2007
Location: Jena, Germany
Posts: 2,662
|
|
Careful: The SQL standard mandates that CTEs are evaluated sequentially and before the main part of the SELECT statement. Of course, this only means that an implementation (DBMS) must produce a result as if this evaluation order was used - the implementation can do whatever it likes as long as the correct results are returned.
Therefore, while the rule to use CTEs is surely not a bad idea in many cases, I would just like to urge everyone else to not consider this as a hard rule.
__________________
Knut Stolze
IBM DB2 Analytics Accelerator
IBM Germany Research & Development
|
|
| Thread Tools |
Search this Thread |
|
|
|
| Display Modes |
Linear Mode
|
Posting Rules
|
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts
HTML code is Off
|
|
|
|
|