Results 1 to 13 of 13
  1. #1
    Join Date
    Jan 2010
    Posts
    3

    Unanswered: Db2 Query optimization to reduce cycle time

    Hi Friends,
    Below query is taking so much time to process, can anybody suggest the better optimization technique for this query:

    Code:
    select 
    	JRF.REF_ORIGIN,
    	
    	ASI.JRNLREF_CLUSTER,
    	ASI.ISSUENO,
    	ASI.ITEMNO,
    	
    	ASA.AUTHOR,
    	
    	AUS.LAST_NAME,
    	AUS.FIRST_NAME,
    	AUS.PATRONYM,
    	
    	ASI.TITLE20,
    	
    	PUB.FULLTITLE,
    	PUB.ISOTITLE,
    	
    	ASI.START_PAGE,
    	ASI.VOLUME,
    	ASI.PUBYEAR,
    	
    	JRF.ARTICLENO,
    	JRF.DOI 
    
    FROM	INST1.ALLSOURCES_ITEM ASI,
    	INST1.CLUSTER_CONTROL CC,
    	INST1.JOURNAL_REFITEM JRF,
    	INST1.XCITATION XC,
    	INST1.ISSUE ISS,
    	INST1.PUBLICATION PUB,
    	INST1.ALLSOURCES_AUTHSHIP ASA,
    	INST1.AUTHORSHIP_SUPL AUS 
    
    WHERE (PUB.PUBLSEQ LIKE '%%%%%J') AND 
          (ASI.PUBYEAR > 1899) AND 
          ((YEAR(current timestamp)-year(CC.LAST_CITE_DATE))>15) AND 
          (ISS.DELETE_STATUS <> 'D') AND 
          (ASI.START_PAGE NOT LIKE 'NIL%') AND 
          (ASI.ISSUENO = ISS.ISSUENO) AND 
          (ASI.ITEMNO = ASA.ITEMNO) AND 
          (ASA.ITEMNO = AUS.ITEMNO) AND 
          (ASI.ITEMNO = JRF.ITEMNO) AND 
          (ASI.ITEMNO = XC.ITEMNO) AND 
          (ASI.JRNLREF_CLUSTER = CC.JRNLREF_CLUSTER) AND 
          (ISS.PUBLSEQ = PUB.PUBLSEQ)

  2. #2
    Join Date
    Jun 2008
    Location
    Chennai
    Posts
    21
    try the following code

    select
    JRF.REF_ORIGIN,

    ASI.JRNLREF_CLUSTER,
    ASI.ISSUENO,
    ASI.ITEMNO,

    ASA.AUTHOR,

    AUS.LAST_NAME,
    AUS.FIRST_NAME,
    AUS.PATRONYM,

    ASI.TITLE20,

    PUB.FULLTITLE,
    PUB.ISOTITLE,

    ASI.START_PAGE,
    ASI.VOLUME,
    ASI.PUBYEAR,

    JRF.ARTICLENO,
    JRF.DOI


    FROM INST1.ALLSOURCES_ITEM ASI

    INNER JOIN INST1.ISSUE ISS
    ON ASI.ISSUENO = ISS.ISSUENO

    INNER JOIN INST1.ALLSOURCES_AUTHSHIP ASA
    ON ASI.ITEMNO = ASA.ITEMNO

    INNER JOIN INST1.AUTHORSHIP_SUPL AUS
    ON ASA.ITEMNO = AUS.ITEMNO

    INNER JOIN INST1.JOURNAL_REFITEM JRF
    ON ASI.ITEMNO = JRF.ITEMNO

    INNER JOIN INST1.XCITATION XC
    ON ASI.ITEMNO = XC.ITEMNO

    INNER JOIN INST1.CLUSTER_CONTROL CC
    ON ASI.JRNLREF_CLUSTER = CC.JRNLREF_CLUSTER

    INNER JOIN INST1.PUBLICATION PUB
    ON ISS.PUBLSEQ = PUB.PUBLSEQ


    WHERE
    (PUB.PUBLSEQ LIKE '&#37;%%%%J') AND
    (ASI.PUBYEAR > 1899) AND
    ((YEAR(current timestamp)-year(CC.LAST_CITE_DATE))>15) AND
    (ISS.DELETE_STATUS <> 'D') AND
    (ASI.START_PAGE NOT LIKE 'NIL%')

  3. #3
    Join Date
    Jun 2003
    Location
    Toronto, Canada
    Posts
    5,516
    Provided Answers: 1
    Quote Originally Posted by agb View Post
    Below query is taking so much time to process, can anybody suggest the better optimization technique for this query:
    You should look at the query explain plan and dynamic SQL snapshot (if the query is dynamic) to identify improvement opportunities.
    ---
    "It does not work" is not a valid problem statement.

  4. #4
    Join Date
    Jan 2010
    Posts
    3
    it is taking much time again.

  5. #5
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    1) Your SELECT list doesn't include columns of
    INST1.CLUSTER_CONTROL CC
    INST1.XCITATION XC
    INST1.ISSUE ISS

    So, referencing to "INST1.CLUSTER_CONTROL CC"
    and "INST1.XCITATION XC" might be replaced by EXISTS predicates,
    if duplicate results could be neglected.
    ("INST1.ISSUE ISS" would be necessary to join with "INST1.PUBLICATION PUB")

    2) Is an index used for the following predicate?
    ((YEAR(current timestamp)-year(CC.LAST_CITE_DATE))>15)

    You can try something like this.....
    (I don't like redundant parentheses. I don't think that will improve readability.)
    (Changing sequence of predicates are for readability. Not expected performance improvement.)
    Code:
    SELECT 
           JRF.REF_ORIGIN,
           ASI.JRNLREF_CLUSTER,
           ASI.ISSUENO,
           ASI.ITEMNO,
           ASA.AUTHOR,
           AUS.LAST_NAME,
           AUS.FIRST_NAME,
           AUS.PATRONYM,
           ASI.TITLE20,
           PUB.FULLTITLE,
           PUB.ISOTITLE,
           ASI.START_PAGE,
           ASI.VOLUME,
           ASI.PUBYEAR,
           JRF.ARTICLENO,
           JRF.DOI 
      FROM INST1.ALLSOURCES_ITEM     ASI
         , INST1.JOURNAL_REFITEM     JRF
         , INST1.ISSUE               ISS
         , INST1.PUBLICATION         PUB
         , INST1.ALLSOURCES_AUTHSHIP ASA
         , INST1.AUTHORSHIP_SUPL     AUS
     WHERE ASI.START_PAGE    NOT LIKE 'NIL&#37;'
       AND ASI.PUBYEAR       >  1899
       AND ASI.ITEMNO        =  JRF.ITEMNO
       AND ASI.ISSUENO       =  ISS.ISSUENO
       AND ISS.DELETE_STATUS <> 'D'
       AND ISS.PUBLSEQ       =  PUB.PUBLSEQ
       AND PUB.PUBLSEQ       LIKE '%%%%%J'
       AND ASI.ITEMNO        =  ASA.ITEMNO
       AND ASA.ITEMNO        =  AUS.ITEMNO
       AND EXISTS
           (SELECT * 
              FROM INST1.CLUSTER_CONTROL CC
             WHERE CC.JRNLREF_CLUSTER = ASI.JRNLREF_CLUSTER
               AND CC.LAST_CITE_DATE <= current date - DAYOFYEAR(current date)DAYs - 15 YEARs
           )
       AND EXISTS
           (SELECT *
              FROM INST1.XCITATION XC
             WHERE XC.ITEMNO = ASI.ITEMNO
           )
    Last edited by tonkuma; 01-06-10 at 09:01.

  6. #6
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    If some of ASI.ITEMNO, JRF.ITEMNO, ASA.ITEMNO or AUS.ITEMNO are unique,
    adding constraint UNIQUE(or CREATE UNIQUE index) on unique valued column(s) may let DB2 some more optimization.

  7. #7
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    1) ASI, ISS and PUB are joined each other like this:
    ALLSOURCES_ITEM ASI <---(ASI.ISSUENO = ISS.ISSUENO)---> ISSUE ISS <---(ISS.PUBLSEQ = PUB.PUBLSEQ)---> PUBLICATION PUB
    and all three tables have some restrictive predicates.

    2) JRF, ASA and AUS are joined with ASI by ITEMNO
    Although, AUS is joined with ASA by ITEMNO, there is a condition "ASA.ITEMNO = ASI.ITEMNO".
    So, following equation is true.
    AUS.ITEMNO = ASA.ITEMNO = ASI.ITEMNO
    then AUS.ITEMNO = ASI.ITEMNO


    Here is a query considering to join ASI, ISS and PUB first, then to join JRF, ASA and AUS with ASI.
    (It might be ineffectual with query rewrite mechanism of DB2 optimizer.)
    Code:
    SELECT 
           JRF.    REF_ORIGIN,
           ASI_PUB.JRNLREF_CLUSTER,
           ASI_PUB.ISSUENO,
           ASI_PUB.ITEMNO,
           ASA.    AUTHOR,
           AUS.    LAST_NAME,
           AUS.    FIRST_NAME,
           AUS.    PATRONYM,
           ASI_PUB.TITLE20,
           ASI_PUB.FULLTITLE,
           ASI_PUB.ISOTITLE,
           ASI_PUB.START_PAGE,
           ASI_PUB.VOLUME,
           ASI_PUB.PUBYEAR,
           JRF.    ARTICLENO,
           JRF.    DOI 
      FROM (SELECT DISTINCT
                   ASI.JRNLREF_CLUSTER,
                   ASI.ISSUENO,
                   ASI.ITEMNO,
                   ASI.TITLE20,
                   ASI.START_PAGE,
                   ASI.VOLUME,
                   ASI.PUBYEAR,
                   PUB.FULLTITLE,
                   PUB.ISOTITLE
              FROM INST1.ALLSOURCES_ITEM     ASI
                 , INST1.ISSUE               ISS
                 , INST1.PUBLICATION         PUB
             WHERE
                   ASI.START_PAGE    NOT LIKE 'NIL%'
               AND ASI.PUBYEAR       >  1899
               AND ASI.ISSUENO       =  ISS.ISSUENO
               AND ISS.DELETE_STATUS <> 'D'
               AND ISS.PUBLSEQ       =  PUB.PUBLSEQ
               AND PUB.PUBLSEQ       LIKE '%%%%%J'
               AND EXISTS
                   (SELECT * 
                      FROM INST1.CLUSTER_CONTROL CC
                     WHERE CC.JRNLREF_CLUSTER = ASI.JRNLREF_CLUSTER
                       AND CC.LAST_CITE_DATE <= current date - DAYOFYEAR(current date)DAYs - 15 YEARs
                   )
               AND EXISTS
                   (SELECT *
                      FROM INST1.XCITATION XC
                     WHERE XC.ITEMNO = ASI.ITEMNO
                   )
           ) ASI_PUB
         , INST1.JOURNAL_REFITEM     JRF
         , INST1.ALLSOURCES_AUTHSHIP ASA
         , INST1.AUTHORSHIP_SUPL     AUS
     WHERE
           ASI_PUB.ITEMNO = JRF.ITEMNO
       AND ASI_PUB.ITEMNO = ASA.ITEMNO
       AND ASI_PUB.ITEMNO = AUS.ITEMNO
    ;

  8. #8
    Join Date
    Jan 2010
    Posts
    3
    Quote Originally Posted by kandrusatish View Post
    try the following code

    select
    JRF.REF_ORIGIN,

    ASI.JRNLREF_CLUSTER,
    ASI.ISSUENO,
    ASI.ITEMNO,

    ASA.AUTHOR,

    AUS.LAST_NAME,
    AUS.FIRST_NAME,
    AUS.PATRONYM,

    ASI.TITLE20,

    PUB.FULLTITLE,
    PUB.ISOTITLE,

    ASI.START_PAGE,
    ASI.VOLUME,
    ASI.PUBYEAR,

    JRF.ARTICLENO,
    JRF.DOI


    FROM INST1.ALLSOURCES_ITEM ASI

    INNER JOIN INST1.ISSUE ISS
    ON ASI.ISSUENO = ISS.ISSUENO

    INNER JOIN INST1.ALLSOURCES_AUTHSHIP ASA
    ON ASI.ITEMNO = ASA.ITEMNO

    INNER JOIN INST1.AUTHORSHIP_SUPL AUS
    ON ASA.ITEMNO = AUS.ITEMNO

    INNER JOIN INST1.JOURNAL_REFITEM JRF
    ON ASI.ITEMNO = JRF.ITEMNO

    INNER JOIN INST1.XCITATION XC
    ON ASI.ITEMNO = XC.ITEMNO

    INNER JOIN INST1.CLUSTER_CONTROL CC
    ON ASI.JRNLREF_CLUSTER = CC.JRNLREF_CLUSTER

    INNER JOIN INST1.PUBLICATION PUB
    ON ISS.PUBLSEQ = PUB.PUBLSEQ


    WHERE
    (PUB.PUBLSEQ LIKE '&#37;%%%%J') AND
    (ASI.PUBYEAR > 1899) AND
    ((YEAR(current timestamp)-year(CC.LAST_CITE_DATE))>15) AND
    (ISS.DELETE_STATUS <> 'D') AND
    (ASI.START_PAGE NOT LIKE 'NIL%')
    Hi Friend,
    Still it is taking so much time. is any better way we can optimize.

  9. #9
    Join Date
    Jan 2007
    Location
    Jena, Germany
    Posts
    2,721
    I guess you have the necessary indexes on the join columns and also the columns used in the WHERE clause? Have you collected the latest statistics, including column distribution?

    Have you tried creating an MQT? Since this is a more complex query, I assume we have more of a warehouse instead an OLTP application and, thus, can tolerate the additional overhead for maintaining MQTs. If you do have an OLTP-style application, you may want to work on the table design - joining 8 tables will take its time. So you can partition the tables horizontally or vertically to reduce the data volume to be scanned, for instance. Another option is to reorganize the table in such a way that the typically accessed data is close to each other in terms of placement on the data pages, i.e. improve the benefits of prefetching.

    But all this are the usual performance tuning things - without knowing the access plan and typical workload, one cannot be more specific. For example, if this is just one long-running query and you have n other queries that are just fine, changing the database structure may not be the best move...
    Knut Stolze
    IBM DB2 Analytics Accelerator
    IBM Germany Research & Development

  10. #10
    Join Date
    Jun 2008
    Location
    Chennai
    Posts
    21
    Hi Agb,

    actually using two LIKE operators in the Where clause is causing full table scan and taking more time for execution.

    the query need to be tuned to avoid the full table scan ... and that's why i'm unable to give any idea unless doing some research on indexes for the above query using the query execution plan

    and also i have adoubt on using the statement PUB.PUBLSEQ LIKE '&#37;%%%%J'

    if you are trying to find all the 6 character strings having the last character as 'J', then the above statement should be written as PUB.PUBLSEQ LIKE '_____J' (replace '%' with '_')

    or if you are trying to find all the strings of any length and having the last character as 'J', then the above statement should be written as PUB.PUBLSEQ LIKE '%J'

    Please change it accordingly ...

  11. #11
    Join Date
    Jun 2008
    Location
    Chennai
    Posts
    21
    sorry agb, i don't know about the usage of wildcard characters for LIKE operator in DB2. I raised the above doubt in the perspective of MS SQL server

  12. #12
    Join Date
    Jan 2009
    Location
    Zoetermeer, Holland
    Posts
    746
    Quote Originally Posted by kandrusatish View Post
    actually using two LIKE operators in the Where clause is causing full table scan and taking more time for execution.
    Did you pay special attention to the table/columns in your RUNSTATS? Did you use the "like-statistics" option? Might help

  13. #13
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    Hi Friend,
    Still it is taking so much time. is any better way we can optimize.
    agb,
    I already gave you three suggestions.
    If they were not effective, please inform me.

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •