If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Database Server Software > DB2 > How does join works

View Poll Results: Is this question concerned with DB2 administrator
yes 0 0%
no 0 0%
can't say 3 100.00%
Voters: 3. You may not vote on this poll

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 09-21-11, 12:15
rahulvats2007 rahulvats2007 is offline
Registered User
 
Join Date: Sep 2011
Posts: 3
Red face How does join works

hi all, could smone help me in understand how db2 joins internally. I have read that the order of SQL is as

FROM
WHERE
GROUP BY
HAVING
SELECT

For the query below:

SELECT<some cols>
FROM TMIB_MBR_REIMB TMIB1
,TSLD_BCH_CLM_DET
LEFT OUTER JOIN
TIBP_ISS_BCH_PART
ON SLD_ISS_ID = IBP_IBA_AIS_ID
AND SLD_BCH_ID = IBP_IBA_BCH_ID
AND SLD_PIT_SEQ_NB = IBP_PIT_SEQ_NB
WHERE (MIB_CHK_GID_ID = 0
OR MIB_CHK_GID_ID = 888888888)
AND SLD_ISS_ID = MIB_ISS_ID
AND SLD_BCH_ID = MIB_BCH_ID
AND MIB_PIT_SEQ_NB = SLD_PIT_SEQ_NB
AND MIB_PIT_CLM_SEQ_NB = SLD_PIT_CLM_SEQ_NB

I believe that :
1) DB2 first prepare a result set say R1 by cartesian product of
TMIB_MBR_REIMB TMIB1 and TSLD_BCH_CLM_DET.
2) R1 is then left outer joined with TIBP_ISS_BCH_PART to give R2
3) Now as per sequence the WHERE clause is executed to filter out rows from
R2.

Please confirm if I am correct.
Reply With Quote
  #2 (permalink)  
Old 09-21-11, 12:50
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
i am not as familiar with DB2 as other database engines, but there is a very good chance that this --
Code:
TSLD_BCH_CLM_DET 
LEFT OUTER JOIN 
TIBP_ISS_BCH_PART
is executed first, and the results then inner joined to TMIB_MBR_REIMB TMIB1
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #3 (permalink)  
Old 09-21-11, 12:51
ARWinner ARWinner is offline
Registered User
 
Join Date: Jan 2003
Posts: 3,575
You are incorrect. DB2 does not do the same thing for each type of query. What DB2 does do, is examine the statistics of the tables, look at the indexes, look at the predicates, then determine what it thinks is the best access plan. Small changes to any part of this can greatly affect the access plan.

Andy
Reply With Quote
  #4 (permalink)  
Old 09-21-11, 13:23
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
Quote:
Originally Posted by ARWinner View Post
You are incorrect.
thanks for putting it so tactfully

did you examine the way that the three tables are joined?
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #5 (permalink)  
Old 09-21-11, 13:30
ARWinner ARWinner is offline
Registered User
 
Join Date: Jan 2003
Posts: 3,575
My response was to the original post, not your reply. My experience has shown that DB2 will pick whatever access plan it deems is the fastest bast on the statistics and available indexes. A lot of times, how I expect the data to be read and how DB2 actually reads the data are totally different. One cannot just say that with a particular type of query, DB2 will always use a specific access plan. It is more complicated than that.

Andy
Reply With Quote
  #6 (permalink)  
Old 09-21-11, 14:03
sathyaram_s sathyaram_s is offline
Super Moderator
 
Join Date: Aug 2001
Location: UK
Posts: 4,534
Quote:
DB2 first prepare a result set say R1 by cartesian product of
TMIB_MBR_REIMB TMIB1 and TSLD_BCH_CLM_DET.
I have not seen a cartesian product being built to do perform a join.

Have you tried an explain plan.


===
__________________
Visit the new-look IDUG Website , register to gain access to the excellent content.
Reply With Quote
  #7 (permalink)  
Old 09-21-11, 17:20
tonkuma tonkuma is offline
Registered User
 
Join Date: Feb 2008
Location: Japan
Posts: 2,193
In DB2 for LUW SQL Reference Volume 1
Quote:
The clauses of the subselect are processed in the following sequence:
1. FROM clause
2. WHERE clause
3. GROUP BY clause
4. HAVING clause
5. SELECT clause
6. ORDER BY clause
7. FETCH FIRST clause
The sequence is logical sequence.
If same result could be produced more efficiently with another way,
DB2 optimizer is smart enough to use the way(access path) in actual access.

Example 1: DB2 may produce the result of the following query without accessing the employee table.
Because, empno column was defined NOT NULL.
Then "(number of rows with empno is null) = 0" is apparent without accessing the table.
Code:
------------------------------ Commands Entered ------------------------------
SELECT COUNT(*) AS count_rows
 FROM  employee
 WHERE empno IS NULL
;
------------------------------------------------------------------------------

COUNT_ROWS 
-----------
          0

  1 record(s) selected.
Example 2: Another popular example is DB2 try to apply conditions in WHERE clause as early as possible.
It may be the time accessing tables in FROM clause,
and it may be before joining with another tables,
if the same result was guaranteed as processed in the logical sequence.
Reply With Quote
  #8 (permalink)  
Old 09-21-11, 17:42
tonkuma tonkuma is offline
Registered User
 
Join Date: Feb 2008
Location: Japan
Posts: 2,193
You can see an example of the importance of the understanding of the (logical) processing sequence in
alais name in the group by clause

Another effective use of the logical processing sequence may be designing/uderstanding of complex queries.
It must be great help to follow the seuence step by step and suppose the (logical or virtual) intermidiate result table of each step.
Reply With Quote
  #9 (permalink)  
Old 09-21-11, 21:24
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
so the fact that the given query mixes old-style comma join with explicit join syntax, that doesn't matter?

i.e.

FROM a , b LEFT OUTER JOIN c
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #10 (permalink)  
Old 09-21-11, 22:27
n_i n_i is offline
:-)
 
Join Date: Jun 2003
Location: Toronto, Canada
Posts: 4,449
Quote:
Originally Posted by r937 View Post
so the fact that the given query mixes old-style comma join with explicit join syntax, that doesn't matter?

i.e.

FROM a , b LEFT OUTER JOIN c
I think semantically it amounts to
Code:
...FROM a INNER JOIN b LEFT OUTER JOIN c
which in this case means the same thing, so DB2 can choose any order of processing the joins.
Reply With Quote
  #11 (permalink)  
Old 09-22-11, 01:16
rahulvats2007 rahulvats2007 is offline
Registered User
 
Join Date: Sep 2011
Posts: 3
Quote:
Originally Posted by tonkuma View Post
You can see an example of the importance of the understanding of the (logical) processing sequence in
alais name in the group by clause

Another effective use of the logical processing sequence may be designing/uderstanding of complex queries.
It must be great help to follow the seuence step by step and suppose the (logical or virtual) intermidiate result table of each step.
Thanks for your reply. Could you please let me know what would be the logical processing sequence of my posted query. The aim of my question was not on how DB2 optimizes the query. I want to know how different joins would be applied to my posted query to give the Final result as I have to change the query and add more join conditions.
Reply With Quote
  #12 (permalink)  
Old 09-22-11, 01:17
rahulvats2007 rahulvats2007 is offline
Registered User
 
Join Date: Sep 2011
Posts: 3
Quote:
Originally Posted by ARWinner View Post
My response was to the original post, not your reply. My experience has shown that DB2 will pick whatever access plan it deems is the fastest bast on the statistics and available indexes. A lot of times, how I expect the data to be read and how DB2 actually reads the data are totally different. One cannot just say that with a particular type of query, DB2 will always use a specific access plan. It is more complicated than that.

Andy
Thanks for your reply. Could you please let me know what would be the logical processing sequence of my posted query. The aim of my question was not on how DB2 optimizes the query. I want to know how different joins would be applied to my posted query to give the Final result as I have to change the query and add more join conditions to it.
Reply With Quote
  #13 (permalink)  
Old 09-22-11, 02:19
tonkuma tonkuma is offline
Registered User
 
Join Date: Feb 2008
Location: Japan
Posts: 2,193
n_i already answered your case
which is the query mixes old-style comma join with explicit join syntax and asked by r937.
Reply With Quote
  #14 (permalink)  
Old 09-22-11, 02:59
tonkuma tonkuma is offline
Registered User
 
Join Date: Feb 2008
Location: Japan
Posts: 2,193
Your query looks like...
FROM tmib_mbr_reimb tmib1 , tsld_bch_clm_det LEFT OUTER JOIN tibp_iss_bch_part ON ... WHERE ...

Seeing syntax diagrams and descriptions of from-clause, table-reference and joined-table,
I think it is natural to assum the query is equivalent to

FROM tmib_mbr_reimb tmib1 CROSS JOIN (tsld_bch_clm_det LEFT OUTER JOIN tibp_iss_bch_part ON ...) WHERE ...

that means...
1) join tsld_bch_clm_det and tibp_iss_bch_part result R1
2) cross join(Cartesian product) tmib_mbr_reimb tmib1 and R1 result R2
3) apply WHERE condition to R2 and an intermediate result table consists the rows from R2 and the condition is true.

Last edited by tonkuma; 09-22-11 at 03:04.
Reply With Quote
Reply

Tags
joins db2 concept

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On