Results 1 to 9 of 9
  1. #1
    Join Date
    Mar 2012
    Posts
    2

    Unanswered: Innerjoin query executing long time

    Hi All,
    The following query is executing more than 3 hours. I tried to run this query with different options (like combining all the conditions and placed on "ON" clause or putting the where conditions in place of ON clause and vice versa). But there is no improvement. Primary key for this table is UID, CMP_ID, PROD_CD and POL_NBR. The table contains 300 million records out of which 180 million records are satisfying the WHERE clause conditions ( A.CMP_ID = '0000000110'
    AND A.PROD_CD = '512'
    AND A.REL_CD = '0'
    AND A.POL_TYP = 'P'
    ).

    Could you please provide me the approach/solution so that query can execute within an hour. Thanks in advance!!

    SELECT B.CMP_ID,
    CHAR('|'),
    B.PROD_CD,
    CHAR('|'),
    B.POL_NBR,
    CHAR('|'),
    A.UID,
    CHAR('|'),
    A.ACCT_OPN_DT,
    CHAR('|'),
    A.MNT_TS,
    CHAR('|'),
    A.MNT_OPER_ID,
    CHAR('|'),
    B.UID,
    CHAR('|'),
    B.ACCT_OPN_DT,
    CHAR('|'),
    B.MNT_TS,
    CHAR('|'),
    B.MNT_OPER_ID
    FROM DEVDB006.CMP_CUST_POL A
    INNER JOIN DEVDB006.CMP_CUST_POL B
    ON A.CMP_ID = B.CMP_ID
    AND A.PROD_CD = B.PROD_CD
    AND A.POL_NBR = B.POL_NBR
    AND A.UID <> B.UID
    WHERE A.REL_CD = B.REL_CD
    AND A.CMP_ID = '0000000110'
    AND A.PROD_CD = '512'
    AND A.REL_CD = '0'
    AND A.POL_TYP = 'P'
    WITH UR;


    Advanced thanks for your help.

    Thanks,
    Venkat.

  2. #2
    Join Date
    Nov 2011
    Posts
    334
    I think maybe DB2 has the only join method -nest loop join chosen to do the innerjoin because there is a (<>) join contition 。
    also the (<>) join contition is on your pk's leading column, so the inner table's lookup is not effeciant ( maybe the index sargable )
    But need access plan to prove it。
    You can try to change the pk colunm in the order of ( CMP_ID, PROD_CD and POL_NBR, UID )
    to see whether it will help....

  3. #3
    Join Date
    Jan 2007
    Location
    Jena, Germany
    Posts
    2,721
    In addition to the already-given suggestion to move CMP_ID in the unique constraint (PK) as last column, you may also want to include REL_CD into the index, probably as INCLUDE column. That would allow DB2 to evaluate all predicates for the join and in the WHERE clause solely on the index.

    You can also consider to include columns UID, ACCT_OPN_DT, MNT_TS, and MNT_OPER_ID in the index to haveth potential for an index-only access plan.

    Finally, get rid of all the CHAR('|') between the columns. This makes the result set unnecessarily complex and increases the amount of data that the server needs to send to the client.
    Knut Stolze
    IBM DB2 Analytics Accelerator
    IBM Germany Research & Development

  4. #4
    Join Date
    Jul 2009
    Location
    NY
    Posts
    963

    Arrow changes required

    1. First of all: the result set is too huge.
    2. We'll get first 10000 rows
    3. We can CHANGE your query to get result set in a faster way:

    Code:
    SELECT B.CMP_ID, 
    CHAR('|'), 
    B.PROD_CD, 
    CHAR('|'), 
    B.POL_NBR, 
    CHAR('|'), 
    A.UID, 
    CHAR('|'), 
    A.ACCT_OPN_DT, 
    CHAR('|'), 
    A.MNT_TS, 
    CHAR('|'), 
    A.MNT_OPER_ID, 
    CHAR('|'), 
    B.UID, 
    CHAR('|'), 
    B.ACCT_OPN_DT, 
    CHAR('|'), 
    B.MNT_TS, 
    CHAR('|'), 
    B.MNT_OPER_ID 
    FROM 
    (select * 
    from
    DEVDB006.CMP_CUST_POL 
    Where
        REL_CD  = '0' 
    AND CMP_ID  = '0000000110' 
    AND PROD_CD = '512' 
    AND POL_TYP = 'P'   ) A
    JOIN Table
    (select C.* 
    from
    DEVDB006.CMP_CUST_POL C
    Where
        C.REL_CD    = '0' 
    AND C.CMP_ID    = '0000000110' 
    AND C.PROD_CD   = '512' 
    and C.POL_NBR = A.POL_NBR 
    AND C.UID    <> A.UID 
    ) B
    
    On 1 = 1
      
    Fetch First 10000 rows only
    WITH UR;
    Lenny

  5. #5
    Join Date
    Mar 2012
    Posts
    2

    Innerjoin query executing long time

    Hi All,
    Thanks for your response.
    I have tried all the solutions provided by you, still the query is running more than 3 hours.
    Lenny, thanks for the detailed query. Even for the fetch first 10000 rows it's executing more than 3 hours.

    Thanks,
    Ravi.

  6. #6
    Join Date
    Nov 2011
    Posts
    334
    plz put your sql in sqlfile and try this :
    db2batch -d <dbname> -f sqlfile -i complete -o e yes r 10 p 5
    db2exfmt -d <dbname> -1

    post the output of db2batch and db2exfmt here ....

  7. #7
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    Consider columns of an index for your query performance by the following priority.

    (1) <column-name> = <constant> in WHERE clause
    CMP_ID , PROD_CD , REL_CD , POL_TYP

    (2) a.<column-name> = b.<column-name>
    POL_NBR

    (3) a.<column-name> <> b.<column-name>
    UID

    (4) other columns in SELECT clause
    ACCT_OPN_DT , MNT_TS , MNT_OPER_ID


    As a conseuence, please try the following index.
    Code:
    CREATE UNIQUE INDEX CMP_CUST_POL_xxx
     ON DEVDB006.CMP_CUST_POL
     (CMP_ID , PROD_CD , REL_CD , POL_TYP , POL_NBR , UID)
     INCLUDE (ACCT_OPN_DT , MNT_TS , MNT_OPER_ID)
    ;

  8. #8
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    ... 180 million records are satisfying WHERE clause conditions
    ( A.CMP_ID = '0000000110'
    AND A.PROD_CD = '512'
    AND A.REL_CD = '0'
    AND A.POL_TYP = 'P'
    ).
    180 million records are too big like Lenny pointed out, it's more than half of 300 million records in the table.
    Could you add some more conditions?


    Another issue is...
    Premise:
    (1) Primary key: UID , CMP_ID , PROD_CD , POL_NBR
    (2) CMP_ID and PROD_CD were eual to specific values(specified in WHERE clause).

    If number of rows of a specific A.POL_NBR (say N) was more than 1(all POL_TYP = 'P' and UIDs must be all different),
    all the rows would be also included in B. For example: UID-1 is different form UID-2, UID-2 is different from UID-1, so on...
    So number of rows in B satisfying matching conditions for a specific A.POL_NBR would be N + M.
    (M include rows which UID is not in set of A and B.POL_TYP <> 'P')
    Then the result of join would be N * (N - 1) + N * M = N * (N + M - 1)

    Could this(number of rows of a specific A.POL_NBR (say N) was more than 1) happen?
    Or, a condition B.POL_TYP <> 'P' could be added?
    Or, wasn't my inference right?
    Last edited by tonkuma; 03-12-12 at 10:18. Reason: Add "Or, a condition B.POL_TYP <> 'P' could be added?"

  9. #9
    Join Date
    Jul 2009
    Location
    NY
    Posts
    963

    Arrow You can try this one

    You can change the query a little bit:

    Code:
    Select
    .........
    .........
    .........
    FROM 
    (select * 
    from
    DEVDB006.CMP_CUST_POL 
    Where
        REL_CD  = '0' 
    AND CMP_ID  = '0000000110' 
    AND PROD_CD = '512' 
    AND POL_TYP = 'P'   
    Fetch First 1000 rows only) A
    JOIN Table
    (select C.* 
    from
    DEVDB006.CMP_CUST_POL C
    Where
        C.REL_CD    = '0' 
    AND C.CMP_ID    = '0000000110' 
    AND C.PROD_CD   = '512' 
    and C.POL_NBR = A.POL_NBR 
    AND C.UID    <> A.UID 
    Fetch First 1000 rows only) B
    
    On 1 = 1
      
    Fetch First 10000 rows only
    WITH UR;
    Lenny

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •