Results 1 to 4 of 4
  1. #1
    Join Date
    Mar 2007
    Posts
    35

    Unanswered: help me to tune sql statement

    I used db2 v8.2 and aix 5.2,
    I wrote the following sql statement but the last "Not Exists" statement is bottelneck and the the result never back and the sql run for ever, could you please let me know how tune the following sql statement.

    Thanks in advance for your help.


    SELECT ac.col1
    FROM ac ac inner join lt lt
    on ( ac.col1 = lt.col1
    AND (lt.col2 = 12 or lt.col2= 19 AND lt.col3 < '2000-03-19' ))
    where NOT EXISTS (SELECT 1 FROM at at, tr tr
    WHERE ac.col1 = at.col1
    AND at.col2 = tr.col1
    AND tr.col2 in (123,127))
    AND NOT EXISTS (SELECT 1 FROM txn txn
    WHERE txn.col1 = ac.col1
    AND ( txn.col2 >= '2005-01-14'
    OR txn.col3 >= '2005-01-14' ))
    AND NOT EXISTS (SELECT 1 FROM trr c1
    WHERE c1.col1 >= '2005-01-14'
    AND (c1.col2 = ac.col1
    OR c1.col1 = ac.col2
    OR c1.col1 in (SELECT ad.col1 FROM ad ad
    where col1= ac.col2 and col2=0 )));

  2. #2
    Join Date
    Jan 2003
    Posts
    4,292
    Provided Answers: 5
    What is the DDL of all the tables and all their indexes? What is the cardinality of each table?

    Andy

  3. #3
    Join Date
    Jan 2007
    Location
    Jena, Germany
    Posts
    2,721
    Besides the regular tuning stuff with indexes etc. you could try to rewrite your SQL statement to not use correlated subqueries. Using uncorrelated ones allows DB2 to compute the result for the subquery once and then do a scan over the (joined) table in the FROM clause of the outer-most fullselect and reuse those precomputed temporary results. Of course, DB2 will try to do such a rewrite, but it may not find the correct rewriting rules for various reasons.

    Code:
    SELECT  ac.col1
    FROM ac ac inner join lt lt 
    	on ( ac.col1 = lt.col1
    	     AND (lt.col2 = 12 or lt.col2= 19 AND lt.col3 < '2000-03-19' ))
    Except for the "ac.col1 = lt.col1", all other predicates are not join predicates and, therefore, should be removed from the join condition and placed into the WHERE clause. Otherwise, you just confuse matters unnecessarily.

    Code:
    where NOT EXISTS (SELECT 1 FROM at at, tr tr 
                    WHERE ac.col1 = at.col1  
                    AND at.col2 = tr.col1 
    		AND tr.col2 in (123,127))
    This is a correlated subquery. You should try to rewrite this to an uncorrelated one. Also, you use the explicit join notation in the outer select, so it would be more consistent if you use the same notation in the subselects.
    Code:
    ac.col1 NOT IN ( SELECT at.col1
                     FROM   at at
                               JOIN
                            tr tr
                               ON ( at.col2 = tr.col1 )
                     WHERE tr.col2 in ( 123, 127 ) )
    Same with the next NOT EXISTS predicate
    Code:
    ac.col1 NOT IN ( SELECT txn.col1
                     FROM   txn txn
                     WHERE  txn.col2 >= '2005-01-14' OR
                            txn.col3 >= '2005-01-14' )
    The last one is a little bit trickier because you have another nested subselect in there, which even relates back to the outer-most fullselect.

    Let's first change the nested subselect to an uncorrelated one. The condition "col1 = ac.col2" can be moved out of the nested subselect because col1 = ad.col1 = ac.col2 is implied.
    Code:
    NOT EXISTS ( SELECT 1
                 FROM   trr c1
                 WHERE  c1.col1 >= '2005-01-14' AND
                        ( c1.col2 = ac.col1 OR
                          ( c1.col1 = ac.col2 AND
                            c1.col1 IN ( SELECT ad.col1
                                         FROM   ad ad
                                         WHERE  ad.col2 = 0 ) ) )
    Note that you can remove the 2nd condition in the WHERE clause of the (outer) subselect because "c1.col1 = ac.col2" is already covered in the nested subselect.

    Next, we can change the OR operator to a UNION ALL or even pull this out of the subselect. Since this is a NOT EXISTS, the OR would become an AND.
    Code:
    NOT EXISTS ( SELECT 1
                 FROM   trr c1
                 WHERE  c1.col1 >= '2005-01-14' AND
                        c1.col2 = ac.col1 )
    AND
    NOT EXISTS ( SELECT 1
                 FROM   trr c1
                 WHERE  c1.col1 >= '2005-01-14' AND
                        c1.col1 = ac.col2 AND
                        c1.col1 IN ( SELECT ad.col1
                                     FROM   ad ad
                                     WHERE  ad.col2 = 0 ) )
    Finally, use uncorrelated subqueries again:
    Code:
    ac.col1 NOT IN ( SELECT c1.col1
                     FROM   trr c1
                     WHERE  c1.col1 >= '2005-01-14' )
    AND
    ac.col2 NOT IN ( SELECT c1.col1
                     FROM   trr c1
                     WHERE  c1.col1 >= '2005-01-14' AND
                            c1.col1 IN ( SELECT ad.col1
                                         FROM   ad ad
                                         WHERE  ad.col2 = 0 ) )
    p.s: I didn't try this with any sample data, so I may have made one or mistakes during the transformations or I may not have been aware of some implicit knowledge you have.
    Last edited by stolze; 03-18-08 at 19:13.
    Knut Stolze
    IBM DB2 Analytics Accelerator
    IBM Germany Research & Development

  4. #4
    Join Date
    Mar 2007
    Posts
    35
    Quote Originally Posted by stolze
    Besides the regular tuning stuff with indexes etc. you could try to rewrite your SQL statement to not use correlated subqueries. Using uncorrelated ones allows DB2 to compute the result for the subquery once and then do a scan over the (joined) table in the FROM clause of the outer-most fullselect and reuse those precomputed temporary results. Of course, DB2 will try to do such a rewrite, but it may not find the correct rewriting rules for various reasons.



    Except for the "ac.col1 = lt.col1", all other predicates are not join predicates and, therefore, should be removed from the join condition and placed into the WHERE clause. Otherwise, you just confuse matters unnecessarily.



    This is a correlated subquery. You should try to rewrite this to an uncorrelated one. Also, you use the explicit join notation in the outer select, so it would be more consistent if you use the same notation in the subselects.
    Code:
    ac.col1 NOT IN ( SELECT at.col1
                     FROM   at at
                               JOIN
                            tr tr
                               ON ( at.col2 = tr.col1 )
                     WHERE tr.col2 in ( 123, 127 ) )
    Same with the next NOT EXISTS predicate
    Code:
    ac.col1 NOT IN ( SELECT txn.col1
                     FROM   txn txn
                     WHERE  txn.col2 >= '2005-01-14' OR
                            txn.col3 >= '2005-01-14' )
    The last one is a little bit trickier because you have another nested subselect in there, which even relates back to the outer-most fullselect.

    Let's first change the nested subselect to an uncorrelated one. The condition "col1 = ac.col2" can be moved out of the nested subselect because col1 = ad.col1 = ac.col2 is implied.
    Code:
    NOT EXISTS ( SELECT 1
                 FROM   trr c1
                 WHERE  c1.col1 >= '2005-01-14' AND
                        ( c1.col2 = ac.col1 OR
                          ( c1.col1 = ac.col2 AND
                            c1.col1 IN ( SELECT ad.col1
                                         FROM   ad ad
                                         WHERE  ad.col2 = 0 ) ) )
    Note that you can remove the 2nd condition in the WHERE clause of the (outer) subselect because "c1.col1 = ac.col2" is already covered in the nested subselect.

    Next, we can change the OR operator to a UNION ALL or even pull this out of the subselect. Since this is a NOT EXISTS, the OR would become an AND.
    Code:
    NOT EXISTS ( SELECT 1
                 FROM   trr c1
                 WHERE  c1.col1 >= '2005-01-14' AND
                        c1.col2 = ac.col1 )
    AND
    NOT EXISTS ( SELECT 1
                 FROM   trr c1
                 WHERE  c1.col1 >= '2005-01-14' AND
                        c1.col1 = ac.col2 AND
                        c1.col1 IN ( SELECT ad.col1
                                     FROM   ad ad
                                     WHERE  ad.col2 = 0 ) )
    Finally, use uncorrelated subqueries again:
    Code:
    ac.col1 NOT IN ( SELECT c1.col1
                     FROM   trr c1
                     WHERE  c1.col1 >= '2005-01-14' )
    AND
    ac.col2 NOT IN ( SELECT c1.col1
                     FROM   trr c1
                     WHERE  c1.col1 >= '2005-01-14' AND
                            c1.col1 IN ( SELECT ad.col1
                                         FROM   ad ad
                                         WHERE  ad.col2 = 0 ) )
    p.s: I didn't try this with any sample data, so I may have made one or mistakes during the transformations or I may not have been aware of some implicit knowledge you have.
    Thank you so much stolze, your information was really helpful

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •