Results 1 to 14 of 14
  1. #1
    Join Date
    Apr 2010
    Posts
    3

    Unanswered: Slow performance of IN clause

    Hello,

    I wrote an application which gets information from an DB2 (readonly access). The user can specify parameters on the UI and gets the informations. Therefor I have to do some requests with IN clauses, sometimes with about 1,000 elements (specified by the application). The performance of this requests is slow (execution time: about 75 seconds). Is there any way to speedup the request? What's about the idea of starting 1,000 'simple' requests?

    Shall I avoid IN clauses? If yes what's the alternative?

    pkoeppe

  2. #2
    Join Date
    Jun 2007
    Location
    germany
    Posts
    155
    insure that the elements of the IN-LIST are sorted and duplicates are removed.

    is this dynamic sql or inbedded?
    Dick Brenholtz, Ami in Deutschland

  3. #3
    Join Date
    Apr 2010
    Posts
    3
    Duplicates are eleminated. How shall I sort string values ascending or descending?

    It's a parametrized request.

  4. #4
    Join Date
    Jun 2007
    Location
    germany
    Posts
    155
    Craig Mullins says ascending-
    is the column in an index?
    Dick Brenholtz, Ami in Deutschland

  5. #5
    Join Date
    Jun 2003
    Location
    Toronto, Canada
    Posts
    5,516
    Provided Answers: 1
    Quote Originally Posted by dbzTHEdinosaur View Post
    insure that the elements of the IN-LIST are sorted and duplicates are removed.
    I'm almost sure DB2 will do that more efficiently; besides, it won't know the list is sorted and unique and will attempt to sort it anyway.

    Typically, any attempt to tune an SQL statement begins with analyzing its explain plan.

    Since the author chose not to disclose the plan, the statement itself, or even the DB2 version, it is difficult to give more specific advice.

  6. #6
    Join Date
    Jun 2007
    Location
    germany
    Posts
    155

    Wink

    Quote Originally Posted by n_i View Post
    Typically, any attempt to tune an SQL statement begins with analyzing its explain plan.
    true

    Quote Originally Posted by n_i View Post
    Since the author chose not to disclose the plan, the statement itself, or even the DB2 version, it is difficult to give more specific advice.
    true

    Quote Originally Posted by n_i View Post
    I'm almost sure DB2 will do that more efficiently; besides, it won't know the list is sorted and unique and will attempt to sort it anyway.
    possibly, but if db2 only has to do one pass against the IN-LIST to determine sort and duplicates, there is less time spent by db2.

    But db2 will only sort and remove duplicates if the column is indexed; otherwise db2 will not sort and remove duplicates.

    and, leaving all the work to db2 is not always a good thing for the machine;
    but that requires one to think about other than just his sql.
    Dick Brenholtz, Ami in Deutschland

  7. #7
    Join Date
    Jan 2007
    Location
    Jena, Germany
    Posts
    2,721
    Quote Originally Posted by dbzTHEdinosaur View Post
    possibly, but if db2 only has to do one pass against the IN-LIST to determine sort and duplicates, there is less time spent by db2.
    Even if there are duplicates and the list is not sorted, only a "single pass" (whatever that shall mean exactly) is needed. DB2 just sorts the list and removes duplicates as part of that if it finds any.

    But db2 will only sort and remove duplicates if the column is indexed; otherwise db2 will not sort and remove duplicates.
    You are sure about that? Because if IN-lists are sorted, DB2 may be able to convert it to a range predicate (if such an optimization is done). Also, probing each value into the IN-list can be done with binary search instead of touching every element in the list if the list is sorted. That's completely independent of the existence of an index, of course.


    I think the only advantage of removing duplicates in large IN-lists is that the SQL statement becomes shorter and less data needs to be transferred over the wire to the DB2 server, and parsing the SQL statement is slightly faster. But then, you have to have a rreally large IN-list and a lot of duplicates. Another minor possibility could be that the DB2-internal sort algorithm is faster for pre-sorted lists (e.g. like qsort, which has a complexity of O(n^2) worst case but is generally faster than O(n^log(n))). Whether such differences are measurable, is up to debate - I rather doubt that you will see differences in real-world scenarios.
    Knut Stolze
    IBM DB2 Analytics Accelerator
    IBM Germany Research & Development

  8. #8
    Join Date
    Jun 2007
    Location
    germany
    Posts
    155
    Even if there are duplicates and the list is not sorted, only a "single pass" (whatever that shall mean exactly) is needed. DB2 just sorts the list and removes duplicates as part of that if it finds any
    sorry about the language problem. db2 can sort a 'sorted' list, faster than an 'unsorted list'.

    as far as the indexing comment is concerned, I can only quote Craig Mullins' advice and info.
    I assume he does not lie in his publications and have found his info to be correct in the past.

    and you have to keep my prospective in mind. I run in real-world situations which are not beefed-up with special hardware to perform BI and Warehousing DB2 functions.
    Just run on z/OS machines that have to allocate the cpu between different application needs - program utilization as well as service calls - db2, cics, mqs etc.

    if I have 10,000 requests to db2, I want to make them as streamlined as possible.
    a nano-second here, a nano-second there ....
    Dick Brenholtz, Ami in Deutschland

  9. #9
    Join Date
    May 2003
    Location
    USA
    Posts
    5,737
    Quote Originally Posted by dbzTHEdinosaur View Post
    sorry about the language problem. db2 can sort a 'sorted' list, faster than an 'unsorted list'.

    as far as the indexing comment is concerned, I can only quote Craig Mullins' advice and info.
    I assume he does not lie in his publications and have found his info to be correct in the past.

    and you have to keep my prospective in mind. I run in real-world situations which are not beefed-up with special hardware to perform BI and Warehousing DB2 functions.
    Just run on z/OS machines that have to allocate the cpu between different application needs - program utilization as well as service calls - db2, cics, mqs etc.

    if I have 10,000 requests to db2, I want to make them as streamlined as possible.
    a nano-second here, a nano-second there ....
    I believe that Craig Mullins works/writes mostly in the z/OS arena, and you mentioned z/OS above. The OP poster did not mention which OS he is working on for the DB2 server, but about 95% of the questions in this forum are for LUW.

    I do believe that a large "IN" list has been a performance problem in the past for DB2 LUW, but not sure about the latest release (9.7).
    M. A. Feldman
    IBM Certified DBA on DB2 for Linux, UNIX, and Windows
    IBM Certified DBA on DB2 for z/OS and OS/390

  10. #10
    Join Date
    Jul 2009
    Location
    NY
    Posts
    963
    Put all values from the IN to some (temporary) table and replace IN by JOIN, using "="....

    Lenny

  11. #11
    Join Date
    Apr 2010
    Posts
    3
    Thanks for your replies until now.

    The database version is DB2/6000 8.2 and isn't administrated by myself. Also I only have readonly access to a specific tablespace. How can I create a temporary table in this scenario? And how can I see the explain plan?

    @dbzTHEdinosaur: the column used in the IN clause is in an index.

    pkoeppe
    Last edited by pkoeppe; 05-04-10 at 03:17.

  12. #12
    Join Date
    Jun 2003
    Location
    Toronto, Canada
    Posts
    5,516
    Provided Answers: 1
    Quote Originally Posted by pkoeppe View Post
    Also I only have readonly access to a specific tablespace. How can I create a temporary table in this scenario?
    Temporary tables are created in temporary tablespaces. Check out DECLARE GLOBAL TEMPORARY TABLE in the manual.


    Quote Originally Posted by pkoeppe View Post
    And how can I see the explain plan?
    If you have DB2 GUI tools installed, you could use Visual Explain. If not, the simplest way would be to run the db2expln command line utility.

  13. #13
    Join Date
    Jul 2009
    Location
    NY
    Posts
    963

    Exclamation There are many ways to improve your query

    Quote Originally Posted by pkoeppe View Post
    Thanks for your replies until now.

    The database version is DB2/6000 8.2 and isn't administrated by myself. Also I only have readonly access to a specific tablespace. How can I create a temporary table in this scenario? And how can I see the explain plan?

    @dbzTHEdinosaur: the column used in the IN clause is in an index.

    pkoeppe
    Also you can divide you IN clause on few INs.
    Example:


    Code:
    bla_id in (2,3,5,6,12,19)
    <=>
    Code:
    bla_id in (2,3)
    or
    bla_id in (5,6)
    or
    bla_id in (12,19)
    Also you can use UNION ALL instead of OR in same way.

    Lenny

  14. #14
    Join Date
    Jul 2009
    Location
    NY
    Posts
    963
    Also, instead of IN you can use = ANY, or = SOME, or EXISTS.
    It could also increase performance of the query....

    Lenny

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •