If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Database Server Software > DB2 > Slow performance of IN clause

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 04-30-10, 02:17
pkoeppe pkoeppe is offline
Registered User
 
Join Date: Apr 2010
Posts: 3
Slow performance of IN clause

Hello,

I wrote an application which gets information from an DB2 (readonly access). The user can specify parameters on the UI and gets the informations. Therefor I have to do some requests with IN clauses, sometimes with about 1,000 elements (specified by the application). The performance of this requests is slow (execution time: about 75 seconds). Is there any way to speedup the request? What's about the idea of starting 1,000 'simple' requests?

Shall I avoid IN clauses? If yes what's the alternative?

pkoeppe
Reply With Quote
  #2 (permalink)  
Old 04-30-10, 02:34
dbzTHEdinosaur dbzTHEdinosaur is offline
Registered User
 
Join Date: Jun 2007
Location: germany
Posts: 96
insure that the elements of the IN-LIST are sorted and duplicates are removed.

is this dynamic sql or inbedded?
__________________
Dick Brenholtz, Ami in Deutschland
Reply With Quote
  #3 (permalink)  
Old 04-30-10, 02:58
pkoeppe pkoeppe is offline
Registered User
 
Join Date: Apr 2010
Posts: 3
Duplicates are eleminated. How shall I sort string values ascending or descending?

It's a parametrized request.
Reply With Quote
  #4 (permalink)  
Old 04-30-10, 03:09
dbzTHEdinosaur dbzTHEdinosaur is offline
Registered User
 
Join Date: Jun 2007
Location: germany
Posts: 96
Craig Mullins says ascending-
is the column in an index?
__________________
Dick Brenholtz, Ami in Deutschland
Reply With Quote
  #5 (permalink)  
Old 04-30-10, 08:47
n_i n_i is offline
:-)
 
Join Date: Jun 2003
Location: Toronto, Canada
Posts: 4,449
Quote:
Originally Posted by dbzTHEdinosaur View Post
insure that the elements of the IN-LIST are sorted and duplicates are removed.
I'm almost sure DB2 will do that more efficiently; besides, it won't know the list is sorted and unique and will attempt to sort it anyway.

Typically, any attempt to tune an SQL statement begins with analyzing its explain plan.

Since the author chose not to disclose the plan, the statement itself, or even the DB2 version, it is difficult to give more specific advice.
Reply With Quote
  #6 (permalink)  
Old 04-30-10, 09:07
dbzTHEdinosaur dbzTHEdinosaur is offline
Registered User
 
Join Date: Jun 2007
Location: germany
Posts: 96
Wink

Quote:
Originally Posted by n_i View Post
Typically, any attempt to tune an SQL statement begins with analyzing its explain plan.
true

Quote:
Originally Posted by n_i View Post
Since the author chose not to disclose the plan, the statement itself, or even the DB2 version, it is difficult to give more specific advice.
true

Quote:
Originally Posted by n_i View Post
I'm almost sure DB2 will do that more efficiently; besides, it won't know the list is sorted and unique and will attempt to sort it anyway.
possibly, but if db2 only has to do one pass against the IN-LIST to determine sort and duplicates, there is less time spent by db2.

But db2 will only sort and remove duplicates if the column is indexed; otherwise db2 will not sort and remove duplicates.

and, leaving all the work to db2 is not always a good thing for the machine;
but that requires one to think about other than just his sql.
__________________
Dick Brenholtz, Ami in Deutschland
Reply With Quote
  #7 (permalink)  
Old 04-30-10, 09:22
stolze stolze is offline
Registered User
 
Join Date: Jan 2007
Location: Jena, Germany
Posts: 2,662
Quote:
Originally Posted by dbzTHEdinosaur View Post
possibly, but if db2 only has to do one pass against the IN-LIST to determine sort and duplicates, there is less time spent by db2.
Even if there are duplicates and the list is not sorted, only a "single pass" (whatever that shall mean exactly) is needed. DB2 just sorts the list and removes duplicates as part of that if it finds any.

Quote:
But db2 will only sort and remove duplicates if the column is indexed; otherwise db2 will not sort and remove duplicates.
You are sure about that? Because if IN-lists are sorted, DB2 may be able to convert it to a range predicate (if such an optimization is done). Also, probing each value into the IN-list can be done with binary search instead of touching every element in the list if the list is sorted. That's completely independent of the existence of an index, of course.


I think the only advantage of removing duplicates in large IN-lists is that the SQL statement becomes shorter and less data needs to be transferred over the wire to the DB2 server, and parsing the SQL statement is slightly faster. But then, you have to have a rreally large IN-list and a lot of duplicates. Another minor possibility could be that the DB2-internal sort algorithm is faster for pre-sorted lists (e.g. like qsort, which has a complexity of O(n^2) worst case but is generally faster than O(n^log(n))). Whether such differences are measurable, is up to debate - I rather doubt that you will see differences in real-world scenarios.
__________________
Knut Stolze
IBM DB2 Analytics Accelerator
IBM Germany Research & Development
Reply With Quote
  #8 (permalink)  
Old 04-30-10, 09:40
dbzTHEdinosaur dbzTHEdinosaur is offline
Registered User
 
Join Date: Jun 2007
Location: germany
Posts: 96
Quote:
Even if there are duplicates and the list is not sorted, only a "single pass" (whatever that shall mean exactly) is needed. DB2 just sorts the list and removes duplicates as part of that if it finds any
sorry about the language problem. db2 can sort a 'sorted' list, faster than an 'unsorted list'.

as far as the indexing comment is concerned, I can only quote Craig Mullins' advice and info.
I assume he does not lie in his publications and have found his info to be correct in the past.

and you have to keep my prospective in mind. I run in real-world situations which are not beefed-up with special hardware to perform BI and Warehousing DB2 functions.
Just run on z/OS machines that have to allocate the cpu between different application needs - program utilization as well as service calls - db2, cics, mqs etc.

if I have 10,000 requests to db2, I want to make them as streamlined as possible.
a nano-second here, a nano-second there ....
__________________
Dick Brenholtz, Ami in Deutschland
Reply With Quote
  #9 (permalink)  
Old 04-30-10, 11:06
Marcus_A Marcus_A is offline
Registered User
 
Join Date: May 2003
Location: USA
Posts: 5,196
Quote:
Originally Posted by dbzTHEdinosaur View Post
sorry about the language problem. db2 can sort a 'sorted' list, faster than an 'unsorted list'.

as far as the indexing comment is concerned, I can only quote Craig Mullins' advice and info.
I assume he does not lie in his publications and have found his info to be correct in the past.

and you have to keep my prospective in mind. I run in real-world situations which are not beefed-up with special hardware to perform BI and Warehousing DB2 functions.
Just run on z/OS machines that have to allocate the cpu between different application needs - program utilization as well as service calls - db2, cics, mqs etc.

if I have 10,000 requests to db2, I want to make them as streamlined as possible.
a nano-second here, a nano-second there ....
I believe that Craig Mullins works/writes mostly in the z/OS arena, and you mentioned z/OS above. The OP poster did not mention which OS he is working on for the DB2 server, but about 95% of the questions in this forum are for LUW.

I do believe that a large "IN" list has been a performance problem in the past for DB2 LUW, but not sure about the latest release (9.7).
__________________
M. A. Feldman
IBM Certified DBA on DB2 for Linux, UNIX, and Windows
IBM Certified DBA on DB2 for z/OS and OS/390
Reply With Quote
  #10 (permalink)  
Old 04-30-10, 11:41
Lenny77 Lenny77 is offline
Registered User
 
Join Date: Jul 2009
Location: NY
Posts: 886
Put all values from the IN to some (temporary) table and replace IN by JOIN, using "="....

Lenny
Reply With Quote
  #11 (permalink)  
Old 05-04-10, 01:40
pkoeppe pkoeppe is offline
Registered User
 
Join Date: Apr 2010
Posts: 3
Thanks for your replies until now.

The database version is DB2/6000 8.2 and isn't administrated by myself. Also I only have readonly access to a specific tablespace. How can I create a temporary table in this scenario? And how can I see the explain plan?

@dbzTHEdinosaur: the column used in the IN clause is in an index.

pkoeppe

Last edited by pkoeppe; 05-04-10 at 02:17.
Reply With Quote
  #12 (permalink)  
Old 05-04-10, 07:43
n_i n_i is offline
:-)
 
Join Date: Jun 2003
Location: Toronto, Canada
Posts: 4,449
Quote:
Originally Posted by pkoeppe View Post
Also I only have readonly access to a specific tablespace. How can I create a temporary table in this scenario?
Temporary tables are created in temporary tablespaces. Check out DECLARE GLOBAL TEMPORARY TABLE in the manual.


Quote:
Originally Posted by pkoeppe View Post
And how can I see the explain plan?
If you have DB2 GUI tools installed, you could use Visual Explain. If not, the simplest way would be to run the db2expln command line utility.
Reply With Quote
  #13 (permalink)  
Old 05-04-10, 10:04
Lenny77 Lenny77 is offline
Registered User
 
Join Date: Jul 2009
Location: NY
Posts: 886
Exclamation There are many ways to improve your query

Quote:
Originally Posted by pkoeppe View Post
Thanks for your replies until now.

The database version is DB2/6000 8.2 and isn't administrated by myself. Also I only have readonly access to a specific tablespace. How can I create a temporary table in this scenario? And how can I see the explain plan?

@dbzTHEdinosaur: the column used in the IN clause is in an index.

pkoeppe
Also you can divide you IN clause on few INs.
Example:


Code:
bla_id in (2,3,5,6,12,19)
<=>
Code:
bla_id in (2,3)
or
bla_id in (5,6)
or
bla_id in (12,19)
Also you can use UNION ALL instead of OR in same way.

Lenny
Reply With Quote
  #14 (permalink)  
Old 05-04-10, 11:17
Lenny77 Lenny77 is offline
Registered User
 
Join Date: Jul 2009
Location: NY
Posts: 886
Also, instead of IN you can use = ANY, or = SOME, or EXISTS.
It could also increase performance of the query....

Lenny
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On