If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Data Access, Manipulation & Batch Languages > ANSI SQL > JOIN vs Agregations

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 11-07-07, 10:56
gtk gtk is offline
Registered User
 
Join Date: Oct 2004
Location: Paris, FRANCE
Posts: 132
Post JOIN vs Agregations

Hi,
Maybe the topic have already been touched
in that case don't hesitate to give me a link on
the corresponding thread.

I don't know the history concerning the managment
of joined tables through JOIN conditions or WHERE
statements.
But I was wondering if ANSI plans to manage aggregations
with JOIN syntaxes (like subqueries when we deal with
effectives dates for example) or if the future is given to
the subqueries

So do we have to keep using the JOIN syntax for his
lisibility or finally use only subqueries or both ?
Reply With Quote
  #2 (permalink)  
Old 11-07-07, 11:08
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
i'm not sure i understand your question, but i think the answer is "both"

it would be best if you could give an actual example of the sql
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #3 (permalink)  
Old 11-10-07, 23:59
r123456 r123456 is offline
Registered User
 
Join Date: Sep 2003
Location: The extremely Royal borough of Kensington, London
Posts: 778
There is a problem with your question, which is you are comparing two completely different concepts. It is therefore not possible to give you an answer specifically addressing your question, however, I can comment on the use of aggregate functions and their relationship with the Cartesian product of two sets.

The Cartesian product of two sets is not required in order to use an aggregate function. Conversely, one is not required to use aggregate functions in producing the Cartesian product of two sets.
__________________
Bessie Braddock: Winston, you are drunk!
Churchill: And Madam, you are ugly. And tomorrow, I'll be sober, and you will still be ugly.
Reply With Quote
  #4 (permalink)  
Old 11-30-07, 08:49
gtk gtk is offline
Registered User
 
Join Date: Oct 2004
Location: Paris, FRANCE
Posts: 132
Sorry I have badly presented my question
Let me give you an example

Let's imagine those two tables
Code:
+------------+
| employee   |
+------------+
| empl_id    |
| first_name |
| last_name  |
| company_id |
+------------+

+-----------+
| empl_job  |
+-----------+
| empl_id   |
| eff_date  |
| job_title |
+-----------+
Let's imagine we'd like to get the current job title
for each employees of the company xxx
Is it better to do

Code:
SELECT
   a.first_name AS "FirstName"
 , a.last_name AS "LastName"
 , b.job_title AS "JobTitle"
FROM
 employee AS a
 JOIN empl_job AS b ON (
  a.empl_id = b.empl_id
  b.eff_date = (
   SELECT MAX(eff_date)
   FROM empl_job AS c
   WHERE b.job_id = c.job_id
     AND a.empl_id = c.empl_id
     AND eff_date <= SYSDATE
  )
 )
WHERE
 a.company_id = xxx
or

Code:
SELECT
   a.first_name AS "FirstName"
 , a.last_name AS "LastName"
 , b.job_title AS "JobTitle"
FROM
 employee AS a
 JOIN empl_job AS b ON (a.empl_id = b.empl_id)
WHERE a.company_id = xxx
   AND b.eff_date = (
    SELECT MAX(eff_date)
    FROM empl_job AS c
    WHERE b.job_id = c.job_id
     AND a.empl_id = c.empl_id
     AND eff_date <= SYSDATE
   )
  )
or

Code:
SELECT
   a.first_name AS "FirstName"
 , a.last_name AS "LastName"
 , b.job_title AS "JobTitle"
FROM
   employee AS a
 , empl_job AS b
WHERE a.company_id = xxx
   AND a.empl_id = b.empl_id
   AND b.eff_date = (
    SELECT MAX(eff_date)
    FROM empl_job AS c
    WHERE b.job_id = c.job_id
     AND a.empl_id = c.empl_id
     AND eff_date <= SYSDATE
   )
  )
?
Reply With Quote
  #5 (permalink)  
Old 11-30-07, 08:59
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
so it appears that your question is "which approach is better?"

the answer is: test them and see, each dbms is different

by the way i can think of a few more variations of SQL for your problem
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #6 (permalink)  
Old 11-30-07, 09:13
gtk gtk is offline
Registered User
 
Join Date: Oct 2004
Location: Paris, FRANCE
Posts: 132
With your experience,
What syntax do you considere the most maintainable ?
Which is the most portable ? and/or the most performant ?
And if I have to specify a DB, let's choose Oracle
Reply With Quote
  #7 (permalink)  
Old 11-30-07, 09:39
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
they are all equally maintainable

the most portable one is the one which does not use proprietary SQL like SYSDATE

performant? you will have to test them on your particular dbms
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #8 (permalink)  
Old 11-30-07, 12:36
stolze stolze is offline
Registered User
 
Join Date: Jan 2007
Location: Jena, Germany
Posts: 2,662
The one that is easiest to maintain is the one that is easiest to understand by someone else. So maybe a few comments may help to explain what a subquery is doing or why certain constructs are used in specific places.
__________________
Knut Stolze
IBM DB2 Analytics Accelerator
IBM Germany Research & Development
Reply With Quote
  #9 (permalink)  
Old 02-21-08, 05:19
aschk aschk is offline
Registered User
 
Join Date: Mar 2007
Location: 636f6d7075746572
Posts: 770
What about:
Code:
SELECT
   a.first_name AS "FirstName"
 , a.last_name AS "LastName"
 , b.job_title AS "JobTitle"
FROM
 employee AS a
 JOIN empl_job AS b ON (a.empl_id = b.empl_id)
 JOIN (
  SELECT empl_id, MAX(eff_date) as latest
  FROM empl_job AS c
  GROUP BY empl_id
  ) as x ON x.empl_id = a.empl_id AND x.latest = b.eff_date
)
WHERE a.company_id = xxx
note: I think the above works the same as the others (i've not tested it).
Reply With Quote
  #10 (permalink)  
Old 02-21-08, 05:58
gtk gtk is offline
Registered User
 
Join Date: Oct 2004
Location: Paris, FRANCE
Posts: 132
Indeed this one seems good to maintain
Generally I prefere always use JOINs because for example
if you want to exclude a table you simply need to comment
the concerned JOIN part instead of having to seach and
comment all the conditions in the WHERE.
Reply With Quote
  #11 (permalink)  
Old 02-24-08, 07:38
Peter.Vanroose Peter.Vanroose is offline
Registered User
 
Join Date: Sep 2004
Location: Belgium
Posts: 1,079
Speaking about readability & maintainability: I prefer using "temporary views", also known as "common table expressions" (or CTEs):
Code:
WITH
a(empl_id, "FirstName", "LastName") AS
( SELECT empl_id, first_name, last_name FROM employee
  WHERE company_id = xxx
)
,
b(empl_id, eff_date, "JobTitle") AS
( SELECT empl_id, eff_date, job_title FROM empl_job )
,
x(empl_id, latest) AS
(
  SELECT empl_id, MAX(eff_date)
  FROM empl_job
  GROUP BY empl_id
)
SELECT  a."FirstName", a."LastName", b."JobTitle"
FROM
 a INNER JOIN b ON a.empl_id = b.empl_id
   INNER JOIN x ON x.empl_id = a.empl_id AND x.latest = b.eff_date
)
Performance should ideally be left to the DBMS' optimizer: no matter which formulation is used, the implementation (& performance) will be optimal.
(Reality is still different, of course, but DBMSs are evolving towards the ideal world ...)
__________________
--_Peter Vanroose,
__IBM Certified Database Administrator, DB2 9 for z/OS
__IBM Certified Application Developer
__ABIS Training and Consulting
__http://www.abis.be/
Reply With Quote
  #12 (permalink)  
Old 02-24-08, 12:39
stolze stolze is offline
Registered User
 
Join Date: Jan 2007
Location: Jena, Germany
Posts: 2,662
Careful: The SQL standard mandates that CTEs are evaluated sequentially and before the main part of the SELECT statement. Of course, this only means that an implementation (DBMS) must produce a result as if this evaluation order was used - the implementation can do whatever it likes as long as the correct results are returned.

Therefore, while the rule to use CTEs is surely not a bad idea in many cases, I would just like to urge everyone else to not consider this as a hard rule.
__________________
Knut Stolze
IBM DB2 Analytics Accelerator
IBM Germany Research & Development
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On