If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Database Server Software > DB2 > Does Inner Query pulls All the records?

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 10-13-05, 21:59
palchuri palchuri is offline
Registered User
 
Join Date: Jan 2003
Posts: 47
Does Inner Query pulls All the records?

I am having the following sql

update tableB
set columnb='a'
where columna in(select columna from tableA)

tableA contains 2 Million rows and tableB contains 2000 rows. By executing the sql does the inner query fetches all 2 Million rows? Is there any way I can verify?

Thanks
Shri
__________________
p.srinivasarao
Reply With Quote
  #2 (permalink)  
Old 10-14-05, 02:40
Marcus_A Marcus_A is offline
Registered User
 
Join Date: May 2003
Location: USA
Posts: 5,196
Yes it does.

You may want this:

update tableB B
set columnb= 'a'
where columna in
(select columna from tableA A where A. columna = B.columna)

or maybe this:

update tableB
set columnb= 'a'
where columna in (select columna from tableA fetch first 1 rows only)

or maybe this (but it will read all rows in tableA):

update tableB
set columnb= 'a'
where columna in (select distinct(columna) from tableA)
__________________
M. A. Feldman
IBM Certified DBA on DB2 for Linux, UNIX, and Windows
IBM Certified DBA on DB2 for z/OS and OS/390

Last edited by Marcus_A; 10-14-05 at 03:00.
Reply With Quote
  #3 (permalink)  
Old 10-14-05, 07:01
palchuri palchuri is offline
Registered User
 
Join Date: Jan 2003
Posts: 47
Thanks Marcus
__________________
p.srinivasarao
Reply With Quote
  #4 (permalink)  
Old 10-15-05, 09:26
Peter.Vanroose Peter.Vanroose is offline
Registered User
 
Join Date: Sep 2004
Location: Belgium
Posts: 1,079
Quote:
Originally Posted by Marcus_A
update tableB
set columnb= 'a'
where columna in (select distinct(columna) from tableA)
This is logically equivalent to the original query, but will in general be less performant.
(Correct me if I'm wrong.)
__________________
--_Peter Vanroose,
__IBM Certified Database Administrator, DB2 9 for z/OS
__IBM Certified Application Developer
__ABIS Training and Consulting
__http://www.abis.be/
Reply With Quote
  #5 (permalink)  
Old 10-15-05, 09:29
Peter.Vanroose Peter.Vanroose is offline
Registered User
 
Join Date: Sep 2004
Location: Belgium
Posts: 1,079
If tableA contains an index on columna, most likely the inner query will use an index-only lookup; i.e., the table will not be scanned at all.
You may verify this by using EXPLAIN.
__________________
--_Peter Vanroose,
__IBM Certified Database Administrator, DB2 9 for z/OS
__IBM Certified Application Developer
__ABIS Training and Consulting
__http://www.abis.be/
Reply With Quote
  #6 (permalink)  
Old 10-15-05, 10:25
Marcus_A Marcus_A is offline
Registered User
 
Join Date: May 2003
Location: USA
Posts: 5,196
Quote:
Originally Posted by Peter.Vanroose
If tableA contains an index on columna, most likely the inner query will use an index-only lookup; i.e., the table will not be scanned at all.
You may verify this by using EXPLAIN.
The original query makes little sense without the DISTINCT clause, and even then it may not be what was intended. A DB2 query with a very long "IN" list will not perform well, and in the past there have been limits on how may values can there can be in such a clause.
__________________
M. A. Feldman
IBM Certified DBA on DB2 for Linux, UNIX, and Windows
IBM Certified DBA on DB2 for z/OS and OS/390

Last edited by Marcus_A; 10-15-05 at 10:44.
Reply With Quote
  #7 (permalink)  
Old 10-15-05, 10:40
Peter.Vanroose Peter.Vanroose is offline
Registered User
 
Join Date: Sep 2004
Location: Belgium
Posts: 1,079
I would think a good optimizer would be able to find out by itself whether it reorganizes a long "IN" list or not ...

Anybody did some experiments regarding performance of long IN lists?
Any differences between DB2 7 and 8 in this respect?
__________________
--_Peter Vanroose,
__IBM Certified Database Administrator, DB2 9 for z/OS
__IBM Certified Application Developer
__ABIS Training and Consulting
__http://www.abis.be/

Last edited by Peter.Vanroose; 10-15-05 at 15:50.
Reply With Quote
  #8 (permalink)  
Old 10-15-05, 15:30
Wim Wim is offline
Registered User
 
Join Date: Nov 2004
Posts: 1,279
I would write
Code:
update tableB B
set columnb= 'a'
where EXISTS 
(select 1 from tableA A where A.columna = B.columna)
Would the performance be better or worse than using the IN clause ?
__________________
With kind regards . . . . . SQL Server 2000/2005/2008/2008 R2 Earned beers: 16
Wim
Beware of bugs in the above code; I have only proved it correct, not tried it. -- Donald Knuth
Grabel's Law: 2 is not equal to 3 -- not even for very large values of 2.
Pat Phelan's Law: 2 very definitely CAN equal 3 -- in at least two programming languages
Reply With Quote
  #9 (permalink)  
Old 10-15-05, 15:49
Peter.Vanroose Peter.Vanroose is offline
Registered User
 
Join Date: Sep 2004
Location: Belgium
Posts: 1,079
Quote:
Originally Posted by Wim
update tableB B
set columnb= 'a'
where EXISTS
(select 1 from tableA A where A.columna = B.columna)
As compared to the logically equivalent
Code:
update tableB
set columnb= 'a'
where columna IN (select columna from tableA)
the "EXISTS" solution requires running the subquery for every of the 1000 rows of the outer tableA, but on the other hand these 1000 subquery runs each return just 1 value (true or false), while the single subquery run in the "IN" solution may return 1000000 values.
Unless the optimizer is clever enough, I would vote for the "EXISTS" query as being the fastest one. (Roughly 1000 times faster?? Anybody did the experiment?)
All will of course also depend on the presence of indices.
__________________
--_Peter Vanroose,
__IBM Certified Database Administrator, DB2 9 for z/OS
__IBM Certified Application Developer
__ABIS Training and Consulting
__http://www.abis.be/

Last edited by Peter.Vanroose; 10-15-05 at 16:59.
Reply With Quote
  #10 (permalink)  
Old 10-15-05, 16:32
Marcus_A Marcus_A is offline
Registered User
 
Join Date: May 2003
Location: USA
Posts: 5,196
The problem is that the second query is not functionally the same as the first. The second query says that any value in the tableA is valid for any row in tableB, whereas the first query correlates the query to the equivlent row in tableA which have the same value for columna (WHERE A.columna = B.columna).
__________________
M. A. Feldman
IBM Certified DBA on DB2 for Linux, UNIX, and Windows
IBM Certified DBA on DB2 for z/OS and OS/390
Reply With Quote
  #11 (permalink)  
Old 10-15-05, 17:06
Peter.Vanroose Peter.Vanroose is offline
Registered User
 
Join Date: Sep 2004
Location: Belgium
Posts: 1,079
The two queries certainly are equivalent!
The first one will change those rows of TableB for which that row's columna value exists in columna of TableA, while the second one will change those rows of TableB for which that row's columna is in the list of columna values of TableA ...
__________________
--_Peter Vanroose,
__IBM Certified Database Administrator, DB2 9 for z/OS
__IBM Certified Application Developer
__ABIS Training and Consulting
__http://www.abis.be/
Reply With Quote
  #12 (permalink)  
Old 10-17-05, 10:31
test test is offline
Registered User
 
Join Date: Oct 2001
Posts: 46
Quote:
Originally Posted by Peter.Vanroose
The two queries certainly are equivalent!
The first one will change those rows of TableB for which that row's columna value exists in columna of TableA, while the second one will change those rows of TableB for which that row's columna is in the list of columna values of TableA ...
Isn't that like saying "take two steps forward, one step back" is logically equivalent to walking one step at a time? Afterall, the net gain is one step forward!
Reply With Quote
  #13 (permalink)  
Old 10-17-05, 11:35
Peter.Vanroose Peter.Vanroose is offline
Registered User
 
Join Date: Sep 2004
Location: Belgium
Posts: 1,079
Quote:
Originally Posted by test
Isn't that like saying "take two steps forward, one step back" is logically equivalent to walking one step at a time? Afterall, the net gain is one step forward!
In this case, not necessarily:
The "IN" version might by better when the inner table (A) is small and the outer table (B) is large, while the "EXISTS" version will be likely to be more performant in the opposite situation.

So it is more like saying "which is better: two small steps forward, or one big step?" where the answer depends on whether you are climbing a mountain trail or not.
__________________
--_Peter Vanroose,
__IBM Certified Database Administrator, DB2 9 for z/OS
__IBM Certified Application Developer
__ABIS Training and Consulting
__http://www.abis.be/
Reply With Quote
  #14 (permalink)  
Old 10-18-05, 00:14
Marcus_A Marcus_A is offline
Registered User
 
Join Date: May 2003
Location: USA
Posts: 5,196
You are probably right that the two are equivilent. It is difficult to understand what the business purpose of the queries are because they use completely fake table names and columns. It would be easier to understand if the DB2 Sample database was used to make up such queries.
__________________
M. A. Feldman
IBM Certified DBA on DB2 for Linux, UNIX, and Windows
IBM Certified DBA on DB2 for z/OS and OS/390
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On