If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Database Server Software > DB2 > Db2 Query optimization to reduce cycle time

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 01-06-10, 02:17
agb agb is offline
Registered User
 
Join Date: Jan 2010
Posts: 3
Db2 Query optimization to reduce cycle time

Hi Friends,
Below query is taking so much time to process, can anybody suggest the better optimization technique for this query:

Code:
select 
	JRF.REF_ORIGIN,
	
	ASI.JRNLREF_CLUSTER,
	ASI.ISSUENO,
	ASI.ITEMNO,
	
	ASA.AUTHOR,
	
	AUS.LAST_NAME,
	AUS.FIRST_NAME,
	AUS.PATRONYM,
	
	ASI.TITLE20,
	
	PUB.FULLTITLE,
	PUB.ISOTITLE,
	
	ASI.START_PAGE,
	ASI.VOLUME,
	ASI.PUBYEAR,
	
	JRF.ARTICLENO,
	JRF.DOI 

FROM	INST1.ALLSOURCES_ITEM ASI,
	INST1.CLUSTER_CONTROL CC,
	INST1.JOURNAL_REFITEM JRF,
	INST1.XCITATION XC,
	INST1.ISSUE ISS,
	INST1.PUBLICATION PUB,
	INST1.ALLSOURCES_AUTHSHIP ASA,
	INST1.AUTHORSHIP_SUPL AUS 

WHERE (PUB.PUBLSEQ LIKE '%%%%%J') AND 
      (ASI.PUBYEAR > 1899) AND 
      ((YEAR(current timestamp)-year(CC.LAST_CITE_DATE))>15) AND 
      (ISS.DELETE_STATUS <> 'D') AND 
      (ASI.START_PAGE NOT LIKE 'NIL%') AND 
      (ASI.ISSUENO = ISS.ISSUENO) AND 
      (ASI.ITEMNO = ASA.ITEMNO) AND 
      (ASA.ITEMNO = AUS.ITEMNO) AND 
      (ASI.ITEMNO = JRF.ITEMNO) AND 
      (ASI.ITEMNO = XC.ITEMNO) AND 
      (ASI.JRNLREF_CLUSTER = CC.JRNLREF_CLUSTER) AND 
      (ISS.PUBLSEQ = PUB.PUBLSEQ)
Reply With Quote
  #2 (permalink)  
Old 01-06-10, 06:45
kandrusatish kandrusatish is offline
Registered User
 
Join Date: Jun 2008
Location: Chennai
Posts: 21
try the following code

select
JRF.REF_ORIGIN,

ASI.JRNLREF_CLUSTER,
ASI.ISSUENO,
ASI.ITEMNO,

ASA.AUTHOR,

AUS.LAST_NAME,
AUS.FIRST_NAME,
AUS.PATRONYM,

ASI.TITLE20,

PUB.FULLTITLE,
PUB.ISOTITLE,

ASI.START_PAGE,
ASI.VOLUME,
ASI.PUBYEAR,

JRF.ARTICLENO,
JRF.DOI


FROM INST1.ALLSOURCES_ITEM ASI

INNER JOIN INST1.ISSUE ISS
ON ASI.ISSUENO = ISS.ISSUENO

INNER JOIN INST1.ALLSOURCES_AUTHSHIP ASA
ON ASI.ITEMNO = ASA.ITEMNO

INNER JOIN INST1.AUTHORSHIP_SUPL AUS
ON ASA.ITEMNO = AUS.ITEMNO

INNER JOIN INST1.JOURNAL_REFITEM JRF
ON ASI.ITEMNO = JRF.ITEMNO

INNER JOIN INST1.XCITATION XC
ON ASI.ITEMNO = XC.ITEMNO

INNER JOIN INST1.CLUSTER_CONTROL CC
ON ASI.JRNLREF_CLUSTER = CC.JRNLREF_CLUSTER

INNER JOIN INST1.PUBLICATION PUB
ON ISS.PUBLSEQ = PUB.PUBLSEQ


WHERE
(PUB.PUBLSEQ LIKE '&#37;%%%%J') AND
(ASI.PUBYEAR > 1899) AND
((YEAR(current timestamp)-year(CC.LAST_CITE_DATE))>15) AND
(ISS.DELETE_STATUS <> 'D') AND
(ASI.START_PAGE NOT LIKE 'NIL%')
Reply With Quote
  #3 (permalink)  
Old 01-06-10, 07:07
n_i n_i is offline
:-)
 
Join Date: Jun 2003
Location: Toronto, Canada
Posts: 4,449
Quote:
Originally Posted by agb View Post
Below query is taking so much time to process, can anybody suggest the better optimization technique for this query:
You should look at the query explain plan and dynamic SQL snapshot (if the query is dynamic) to identify improvement opportunities.
Reply With Quote
  #4 (permalink)  
Old 01-06-10, 07:28
agb agb is offline
Registered User
 
Join Date: Jan 2010
Posts: 3
it is taking much time again.
Reply With Quote
  #5 (permalink)  
Old 01-06-10, 07:51
tonkuma tonkuma is offline
Registered User
 
Join Date: Feb 2008
Location: Japan
Posts: 2,193
1) Your SELECT list doesn't include columns of
INST1.CLUSTER_CONTROL CC
INST1.XCITATION XC
INST1.ISSUE ISS

So, referencing to "INST1.CLUSTER_CONTROL CC"
and "INST1.XCITATION XC" might be replaced by EXISTS predicates,
if duplicate results could be neglected.
("INST1.ISSUE ISS" would be necessary to join with "INST1.PUBLICATION PUB")

2) Is an index used for the following predicate?
((YEAR(current timestamp)-year(CC.LAST_CITE_DATE))>15)

You can try something like this.....
(I don't like redundant parentheses. I don't think that will improve readability.)
(Changing sequence of predicates are for readability. Not expected performance improvement.)
Code:
SELECT 
       JRF.REF_ORIGIN,
       ASI.JRNLREF_CLUSTER,
       ASI.ISSUENO,
       ASI.ITEMNO,
       ASA.AUTHOR,
       AUS.LAST_NAME,
       AUS.FIRST_NAME,
       AUS.PATRONYM,
       ASI.TITLE20,
       PUB.FULLTITLE,
       PUB.ISOTITLE,
       ASI.START_PAGE,
       ASI.VOLUME,
       ASI.PUBYEAR,
       JRF.ARTICLENO,
       JRF.DOI 
  FROM INST1.ALLSOURCES_ITEM     ASI
     , INST1.JOURNAL_REFITEM     JRF
     , INST1.ISSUE               ISS
     , INST1.PUBLICATION         PUB
     , INST1.ALLSOURCES_AUTHSHIP ASA
     , INST1.AUTHORSHIP_SUPL     AUS
 WHERE ASI.START_PAGE    NOT LIKE 'NIL&#37;'
   AND ASI.PUBYEAR       >  1899
   AND ASI.ITEMNO        =  JRF.ITEMNO
   AND ASI.ISSUENO       =  ISS.ISSUENO
   AND ISS.DELETE_STATUS <> 'D'
   AND ISS.PUBLSEQ       =  PUB.PUBLSEQ
   AND PUB.PUBLSEQ       LIKE '%%%%%J'
   AND ASI.ITEMNO        =  ASA.ITEMNO
   AND ASA.ITEMNO        =  AUS.ITEMNO
   AND EXISTS
       (SELECT * 
          FROM INST1.CLUSTER_CONTROL CC
         WHERE CC.JRNLREF_CLUSTER = ASI.JRNLREF_CLUSTER
           AND CC.LAST_CITE_DATE <= current date - DAYOFYEAR(current date)DAYs - 15 YEARs
       )
   AND EXISTS
       (SELECT *
          FROM INST1.XCITATION XC
         WHERE XC.ITEMNO = ASI.ITEMNO
       )

Last edited by tonkuma; 01-06-10 at 08:01.
Reply With Quote
  #6 (permalink)  
Old 01-06-10, 11:24
tonkuma tonkuma is offline
Registered User
 
Join Date: Feb 2008
Location: Japan
Posts: 2,193
If some of ASI.ITEMNO, JRF.ITEMNO, ASA.ITEMNO or AUS.ITEMNO are unique,
adding constraint UNIQUE(or CREATE UNIQUE index) on unique valued column(s) may let DB2 some more optimization.
Reply With Quote
  #7 (permalink)  
Old 01-06-10, 11:46
tonkuma tonkuma is offline
Registered User
 
Join Date: Feb 2008
Location: Japan
Posts: 2,193
1) ASI, ISS and PUB are joined each other like this:
ALLSOURCES_ITEM ASI <---(ASI.ISSUENO = ISS.ISSUENO)---> ISSUE ISS <---(ISS.PUBLSEQ = PUB.PUBLSEQ)---> PUBLICATION PUB
and all three tables have some restrictive predicates.

2) JRF, ASA and AUS are joined with ASI by ITEMNO
Although, AUS is joined with ASA by ITEMNO, there is a condition "ASA.ITEMNO = ASI.ITEMNO".
So, following equation is true.
AUS.ITEMNO = ASA.ITEMNO = ASI.ITEMNO
then AUS.ITEMNO = ASI.ITEMNO


Here is a query considering to join ASI, ISS and PUB first, then to join JRF, ASA and AUS with ASI.
(It might be ineffectual with query rewrite mechanism of DB2 optimizer.)
Code:
SELECT 
       JRF.    REF_ORIGIN,
       ASI_PUB.JRNLREF_CLUSTER,
       ASI_PUB.ISSUENO,
       ASI_PUB.ITEMNO,
       ASA.    AUTHOR,
       AUS.    LAST_NAME,
       AUS.    FIRST_NAME,
       AUS.    PATRONYM,
       ASI_PUB.TITLE20,
       ASI_PUB.FULLTITLE,
       ASI_PUB.ISOTITLE,
       ASI_PUB.START_PAGE,
       ASI_PUB.VOLUME,
       ASI_PUB.PUBYEAR,
       JRF.    ARTICLENO,
       JRF.    DOI 
  FROM (SELECT DISTINCT
               ASI.JRNLREF_CLUSTER,
               ASI.ISSUENO,
               ASI.ITEMNO,
               ASI.TITLE20,
               ASI.START_PAGE,
               ASI.VOLUME,
               ASI.PUBYEAR,
               PUB.FULLTITLE,
               PUB.ISOTITLE
          FROM INST1.ALLSOURCES_ITEM     ASI
             , INST1.ISSUE               ISS
             , INST1.PUBLICATION         PUB
         WHERE
               ASI.START_PAGE    NOT LIKE 'NIL%'
           AND ASI.PUBYEAR       >  1899
           AND ASI.ISSUENO       =  ISS.ISSUENO
           AND ISS.DELETE_STATUS <> 'D'
           AND ISS.PUBLSEQ       =  PUB.PUBLSEQ
           AND PUB.PUBLSEQ       LIKE '%%%%%J'
           AND EXISTS
               (SELECT * 
                  FROM INST1.CLUSTER_CONTROL CC
                 WHERE CC.JRNLREF_CLUSTER = ASI.JRNLREF_CLUSTER
                   AND CC.LAST_CITE_DATE <= current date - DAYOFYEAR(current date)DAYs - 15 YEARs
               )
           AND EXISTS
               (SELECT *
                  FROM INST1.XCITATION XC
                 WHERE XC.ITEMNO = ASI.ITEMNO
               )
       ) ASI_PUB
     , INST1.JOURNAL_REFITEM     JRF
     , INST1.ALLSOURCES_AUTHSHIP ASA
     , INST1.AUTHORSHIP_SUPL     AUS
 WHERE
       ASI_PUB.ITEMNO = JRF.ITEMNO
   AND ASI_PUB.ITEMNO = ASA.ITEMNO
   AND ASI_PUB.ITEMNO = AUS.ITEMNO
;
Reply With Quote
  #8 (permalink)  
Old 01-07-10, 02:27
agb agb is offline
Registered User
 
Join Date: Jan 2010
Posts: 3
Quote:
Originally Posted by kandrusatish View Post
try the following code

select
JRF.REF_ORIGIN,

ASI.JRNLREF_CLUSTER,
ASI.ISSUENO,
ASI.ITEMNO,

ASA.AUTHOR,

AUS.LAST_NAME,
AUS.FIRST_NAME,
AUS.PATRONYM,

ASI.TITLE20,

PUB.FULLTITLE,
PUB.ISOTITLE,

ASI.START_PAGE,
ASI.VOLUME,
ASI.PUBYEAR,

JRF.ARTICLENO,
JRF.DOI


FROM INST1.ALLSOURCES_ITEM ASI

INNER JOIN INST1.ISSUE ISS
ON ASI.ISSUENO = ISS.ISSUENO

INNER JOIN INST1.ALLSOURCES_AUTHSHIP ASA
ON ASI.ITEMNO = ASA.ITEMNO

INNER JOIN INST1.AUTHORSHIP_SUPL AUS
ON ASA.ITEMNO = AUS.ITEMNO

INNER JOIN INST1.JOURNAL_REFITEM JRF
ON ASI.ITEMNO = JRF.ITEMNO

INNER JOIN INST1.XCITATION XC
ON ASI.ITEMNO = XC.ITEMNO

INNER JOIN INST1.CLUSTER_CONTROL CC
ON ASI.JRNLREF_CLUSTER = CC.JRNLREF_CLUSTER

INNER JOIN INST1.PUBLICATION PUB
ON ISS.PUBLSEQ = PUB.PUBLSEQ


WHERE
(PUB.PUBLSEQ LIKE '&#37;%%%%J') AND
(ASI.PUBYEAR > 1899) AND
((YEAR(current timestamp)-year(CC.LAST_CITE_DATE))>15) AND
(ISS.DELETE_STATUS <> 'D') AND
(ASI.START_PAGE NOT LIKE 'NIL%')
Hi Friend,
Still it is taking so much time. is any better way we can optimize.
Reply With Quote
  #9 (permalink)  
Old 01-07-10, 03:14
stolze stolze is offline
Registered User
 
Join Date: Jan 2007
Location: Jena, Germany
Posts: 2,662
I guess you have the necessary indexes on the join columns and also the columns used in the WHERE clause? Have you collected the latest statistics, including column distribution?

Have you tried creating an MQT? Since this is a more complex query, I assume we have more of a warehouse instead an OLTP application and, thus, can tolerate the additional overhead for maintaining MQTs. If you do have an OLTP-style application, you may want to work on the table design - joining 8 tables will take its time. So you can partition the tables horizontally or vertically to reduce the data volume to be scanned, for instance. Another option is to reorganize the table in such a way that the typically accessed data is close to each other in terms of placement on the data pages, i.e. improve the benefits of prefetching.

But all this are the usual performance tuning things - without knowing the access plan and typical workload, one cannot be more specific. For example, if this is just one long-running query and you have n other queries that are just fine, changing the database structure may not be the best move...
__________________
Knut Stolze
IBM DB2 Analytics Accelerator
IBM Germany Research & Development
Reply With Quote
  #10 (permalink)  
Old 01-07-10, 03:38
kandrusatish kandrusatish is offline
Registered User
 
Join Date: Jun 2008
Location: Chennai
Posts: 21
Hi Agb,

actually using two LIKE operators in the Where clause is causing full table scan and taking more time for execution.

the query need to be tuned to avoid the full table scan ... and that's why i'm unable to give any idea unless doing some research on indexes for the above query using the query execution plan

and also i have adoubt on using the statement PUB.PUBLSEQ LIKE '&#37;%%%%J'

if you are trying to find all the 6 character strings having the last character as 'J', then the above statement should be written as PUB.PUBLSEQ LIKE '_____J' (replace '%' with '_')

or if you are trying to find all the strings of any length and having the last character as 'J', then the above statement should be written as PUB.PUBLSEQ LIKE '%J'

Please change it accordingly ...
Reply With Quote
  #11 (permalink)  
Old 01-07-10, 03:55
kandrusatish kandrusatish is offline
Registered User
 
Join Date: Jun 2008
Location: Chennai
Posts: 21
sorry agb, i don't know about the usage of wildcard characters for LIKE operator in DB2. I raised the above doubt in the perspective of MS SQL server
Reply With Quote
  #12 (permalink)  
Old 01-07-10, 05:40
dr_te_z dr_te_z is offline
Registered User
 
Join Date: Jan 2009
Location: Zoetermeer, Holland
Posts: 555
Quote:
Originally Posted by kandrusatish View Post
actually using two LIKE operators in the Where clause is causing full table scan and taking more time for execution.
Did you pay special attention to the table/columns in your RUNSTATS? Did you use the "like-statistics" option? Might help
Reply With Quote
  #13 (permalink)  
Old 01-07-10, 06:47
tonkuma tonkuma is offline
Registered User
 
Join Date: Feb 2008
Location: Japan
Posts: 2,193
Quote:
Hi Friend,
Still it is taking so much time. is any better way we can optimize.
agb,
I already gave you three suggestions.
If they were not effective, please inform me.
Reply With Quote
Reply

Tags
db2, query

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On