| |
|
If this is your first visit, be sure to check out the FAQ by clicking the link above.
You may have to register before you can post: click the register link above to proceed.
To start viewing messages, select the forum that you want to visit from the selection below.
|
 |

01-06-10, 02:17
|
|
Registered User
|
|
Join Date: Jan 2010
Posts: 3
|
|
|
Db2 Query optimization to reduce cycle time
|
|
Hi Friends,
Below query is taking so much time to process, can anybody suggest the better optimization technique for this query:
Code:
select
JRF.REF_ORIGIN,
ASI.JRNLREF_CLUSTER,
ASI.ISSUENO,
ASI.ITEMNO,
ASA.AUTHOR,
AUS.LAST_NAME,
AUS.FIRST_NAME,
AUS.PATRONYM,
ASI.TITLE20,
PUB.FULLTITLE,
PUB.ISOTITLE,
ASI.START_PAGE,
ASI.VOLUME,
ASI.PUBYEAR,
JRF.ARTICLENO,
JRF.DOI
FROM INST1.ALLSOURCES_ITEM ASI,
INST1.CLUSTER_CONTROL CC,
INST1.JOURNAL_REFITEM JRF,
INST1.XCITATION XC,
INST1.ISSUE ISS,
INST1.PUBLICATION PUB,
INST1.ALLSOURCES_AUTHSHIP ASA,
INST1.AUTHORSHIP_SUPL AUS
WHERE (PUB.PUBLSEQ LIKE '%%%%%J') AND
(ASI.PUBYEAR > 1899) AND
((YEAR(current timestamp)-year(CC.LAST_CITE_DATE))>15) AND
(ISS.DELETE_STATUS <> 'D') AND
(ASI.START_PAGE NOT LIKE 'NIL%') AND
(ASI.ISSUENO = ISS.ISSUENO) AND
(ASI.ITEMNO = ASA.ITEMNO) AND
(ASA.ITEMNO = AUS.ITEMNO) AND
(ASI.ITEMNO = JRF.ITEMNO) AND
(ASI.ITEMNO = XC.ITEMNO) AND
(ASI.JRNLREF_CLUSTER = CC.JRNLREF_CLUSTER) AND
(ISS.PUBLSEQ = PUB.PUBLSEQ)
|
|

01-06-10, 06:45
|
|
Registered User
|
|
Join Date: Jun 2008
Location: Chennai
Posts: 21
|
|
try the following code
select
JRF.REF_ORIGIN,
ASI.JRNLREF_CLUSTER,
ASI.ISSUENO,
ASI.ITEMNO,
ASA.AUTHOR,
AUS.LAST_NAME,
AUS.FIRST_NAME,
AUS.PATRONYM,
ASI.TITLE20,
PUB.FULLTITLE,
PUB.ISOTITLE,
ASI.START_PAGE,
ASI.VOLUME,
ASI.PUBYEAR,
JRF.ARTICLENO,
JRF.DOI
FROM INST1.ALLSOURCES_ITEM ASI
INNER JOIN INST1.ISSUE ISS
ON ASI.ISSUENO = ISS.ISSUENO
INNER JOIN INST1.ALLSOURCES_AUTHSHIP ASA
ON ASI.ITEMNO = ASA.ITEMNO
INNER JOIN INST1.AUTHORSHIP_SUPL AUS
ON ASA.ITEMNO = AUS.ITEMNO
INNER JOIN INST1.JOURNAL_REFITEM JRF
ON ASI.ITEMNO = JRF.ITEMNO
INNER JOIN INST1.XCITATION XC
ON ASI.ITEMNO = XC.ITEMNO
INNER JOIN INST1.CLUSTER_CONTROL CC
ON ASI.JRNLREF_CLUSTER = CC.JRNLREF_CLUSTER
INNER JOIN INST1.PUBLICATION PUB
ON ISS.PUBLSEQ = PUB.PUBLSEQ
WHERE
(PUB.PUBLSEQ LIKE '%%%%%J') AND
(ASI.PUBYEAR > 1899) AND
((YEAR(current timestamp)-year(CC.LAST_CITE_DATE))>15) AND
(ISS.DELETE_STATUS <> 'D') AND
(ASI.START_PAGE NOT LIKE 'NIL%')
|
|

01-06-10, 07:07
|
|
:-)
|
|
Join Date: Jun 2003
Location: Toronto, Canada
Posts: 4,449
|
|
|
|
Quote:
Originally Posted by agb
Below query is taking so much time to process, can anybody suggest the better optimization technique for this query:
|
You should look at the query explain plan and dynamic SQL snapshot (if the query is dynamic) to identify improvement opportunities.
|
|

01-06-10, 07:28
|
|
Registered User
|
|
Join Date: Jan 2010
Posts: 3
|
|
it is taking much time again.
|
|

01-06-10, 07:51
|
|
Registered User
|
|
Join Date: Feb 2008
Location: Japan
Posts: 2,193
|
|
1) Your SELECT list doesn't include columns of
INST1.CLUSTER_CONTROL CC
INST1.XCITATION XC
INST1.ISSUE ISS
So, referencing to "INST1.CLUSTER_CONTROL CC"
and "INST1.XCITATION XC" might be replaced by EXISTS predicates,
if duplicate results could be neglected.
("INST1.ISSUE ISS" would be necessary to join with "INST1.PUBLICATION PUB")
2) Is an index used for the following predicate?
((YEAR(current timestamp)-year(CC.LAST_CITE_DATE))>15)
You can try something like this.....
(I don't like redundant parentheses. I don't think that will improve readability.)
(Changing sequence of predicates are for readability. Not expected performance improvement.)
Code:
SELECT
JRF.REF_ORIGIN,
ASI.JRNLREF_CLUSTER,
ASI.ISSUENO,
ASI.ITEMNO,
ASA.AUTHOR,
AUS.LAST_NAME,
AUS.FIRST_NAME,
AUS.PATRONYM,
ASI.TITLE20,
PUB.FULLTITLE,
PUB.ISOTITLE,
ASI.START_PAGE,
ASI.VOLUME,
ASI.PUBYEAR,
JRF.ARTICLENO,
JRF.DOI
FROM INST1.ALLSOURCES_ITEM ASI
, INST1.JOURNAL_REFITEM JRF
, INST1.ISSUE ISS
, INST1.PUBLICATION PUB
, INST1.ALLSOURCES_AUTHSHIP ASA
, INST1.AUTHORSHIP_SUPL AUS
WHERE ASI.START_PAGE NOT LIKE 'NIL%'
AND ASI.PUBYEAR > 1899
AND ASI.ITEMNO = JRF.ITEMNO
AND ASI.ISSUENO = ISS.ISSUENO
AND ISS.DELETE_STATUS <> 'D'
AND ISS.PUBLSEQ = PUB.PUBLSEQ
AND PUB.PUBLSEQ LIKE '%%%%%J'
AND ASI.ITEMNO = ASA.ITEMNO
AND ASA.ITEMNO = AUS.ITEMNO
AND EXISTS
(SELECT *
FROM INST1.CLUSTER_CONTROL CC
WHERE CC.JRNLREF_CLUSTER = ASI.JRNLREF_CLUSTER
AND CC.LAST_CITE_DATE <= current date - DAYOFYEAR(current date)DAYs - 15 YEARs
)
AND EXISTS
(SELECT *
FROM INST1.XCITATION XC
WHERE XC.ITEMNO = ASI.ITEMNO
)
|
Last edited by tonkuma; 01-06-10 at 08:01.
|

01-06-10, 11:24
|
|
Registered User
|
|
Join Date: Feb 2008
Location: Japan
Posts: 2,193
|
|
If some of ASI.ITEMNO, JRF.ITEMNO, ASA.ITEMNO or AUS.ITEMNO are unique,
adding constraint UNIQUE(or CREATE UNIQUE index) on unique valued column(s) may let DB2 some more optimization.
|
|

01-06-10, 11:46
|
|
Registered User
|
|
Join Date: Feb 2008
Location: Japan
Posts: 2,193
|
|
1) ASI, ISS and PUB are joined each other like this:
ALLSOURCES_ITEM ASI <---(ASI.ISSUENO = ISS.ISSUENO)---> ISSUE ISS <---(ISS.PUBLSEQ = PUB.PUBLSEQ)---> PUBLICATION PUB
and all three tables have some restrictive predicates.
2) JRF, ASA and AUS are joined with ASI by ITEMNO
Although, AUS is joined with ASA by ITEMNO, there is a condition "ASA.ITEMNO = ASI.ITEMNO".
So, following equation is true.
AUS.ITEMNO = ASA.ITEMNO = ASI.ITEMNO
then AUS.ITEMNO = ASI.ITEMNO
Here is a query considering to join ASI, ISS and PUB first, then to join JRF, ASA and AUS with ASI.
(It might be ineffectual with query rewrite mechanism of DB2 optimizer.)
Code:
SELECT
JRF. REF_ORIGIN,
ASI_PUB.JRNLREF_CLUSTER,
ASI_PUB.ISSUENO,
ASI_PUB.ITEMNO,
ASA. AUTHOR,
AUS. LAST_NAME,
AUS. FIRST_NAME,
AUS. PATRONYM,
ASI_PUB.TITLE20,
ASI_PUB.FULLTITLE,
ASI_PUB.ISOTITLE,
ASI_PUB.START_PAGE,
ASI_PUB.VOLUME,
ASI_PUB.PUBYEAR,
JRF. ARTICLENO,
JRF. DOI
FROM (SELECT DISTINCT
ASI.JRNLREF_CLUSTER,
ASI.ISSUENO,
ASI.ITEMNO,
ASI.TITLE20,
ASI.START_PAGE,
ASI.VOLUME,
ASI.PUBYEAR,
PUB.FULLTITLE,
PUB.ISOTITLE
FROM INST1.ALLSOURCES_ITEM ASI
, INST1.ISSUE ISS
, INST1.PUBLICATION PUB
WHERE
ASI.START_PAGE NOT LIKE 'NIL%'
AND ASI.PUBYEAR > 1899
AND ASI.ISSUENO = ISS.ISSUENO
AND ISS.DELETE_STATUS <> 'D'
AND ISS.PUBLSEQ = PUB.PUBLSEQ
AND PUB.PUBLSEQ LIKE '%%%%%J'
AND EXISTS
(SELECT *
FROM INST1.CLUSTER_CONTROL CC
WHERE CC.JRNLREF_CLUSTER = ASI.JRNLREF_CLUSTER
AND CC.LAST_CITE_DATE <= current date - DAYOFYEAR(current date)DAYs - 15 YEARs
)
AND EXISTS
(SELECT *
FROM INST1.XCITATION XC
WHERE XC.ITEMNO = ASI.ITEMNO
)
) ASI_PUB
, INST1.JOURNAL_REFITEM JRF
, INST1.ALLSOURCES_AUTHSHIP ASA
, INST1.AUTHORSHIP_SUPL AUS
WHERE
ASI_PUB.ITEMNO = JRF.ITEMNO
AND ASI_PUB.ITEMNO = ASA.ITEMNO
AND ASI_PUB.ITEMNO = AUS.ITEMNO
;
|
|

01-07-10, 02:27
|
|
Registered User
|
|
Join Date: Jan 2010
Posts: 3
|
|
Quote:
Originally Posted by kandrusatish
try the following code
select
JRF.REF_ORIGIN,
ASI.JRNLREF_CLUSTER,
ASI.ISSUENO,
ASI.ITEMNO,
ASA.AUTHOR,
AUS.LAST_NAME,
AUS.FIRST_NAME,
AUS.PATRONYM,
ASI.TITLE20,
PUB.FULLTITLE,
PUB.ISOTITLE,
ASI.START_PAGE,
ASI.VOLUME,
ASI.PUBYEAR,
JRF.ARTICLENO,
JRF.DOI
FROM INST1.ALLSOURCES_ITEM ASI
INNER JOIN INST1.ISSUE ISS
ON ASI.ISSUENO = ISS.ISSUENO
INNER JOIN INST1.ALLSOURCES_AUTHSHIP ASA
ON ASI.ITEMNO = ASA.ITEMNO
INNER JOIN INST1.AUTHORSHIP_SUPL AUS
ON ASA.ITEMNO = AUS.ITEMNO
INNER JOIN INST1.JOURNAL_REFITEM JRF
ON ASI.ITEMNO = JRF.ITEMNO
INNER JOIN INST1.XCITATION XC
ON ASI.ITEMNO = XC.ITEMNO
INNER JOIN INST1.CLUSTER_CONTROL CC
ON ASI.JRNLREF_CLUSTER = CC.JRNLREF_CLUSTER
INNER JOIN INST1.PUBLICATION PUB
ON ISS.PUBLSEQ = PUB.PUBLSEQ
WHERE
(PUB.PUBLSEQ LIKE '%%%%%J') AND
(ASI.PUBYEAR > 1899) AND
((YEAR(current timestamp)-year(CC.LAST_CITE_DATE))>15) AND
(ISS.DELETE_STATUS <> 'D') AND
(ASI.START_PAGE NOT LIKE 'NIL%')
|
Hi Friend,
Still it is taking so much time. is any better way we can optimize.
|
|

01-07-10, 03:14
|
|
Registered User
|
|
Join Date: Jan 2007
Location: Jena, Germany
Posts: 2,662
|
|
I guess you have the necessary indexes on the join columns and also the columns used in the WHERE clause? Have you collected the latest statistics, including column distribution?
Have you tried creating an MQT? Since this is a more complex query, I assume we have more of a warehouse instead an OLTP application and, thus, can tolerate the additional overhead for maintaining MQTs. If you do have an OLTP-style application, you may want to work on the table design - joining 8 tables will take its time. So you can partition the tables horizontally or vertically to reduce the data volume to be scanned, for instance. Another option is to reorganize the table in such a way that the typically accessed data is close to each other in terms of placement on the data pages, i.e. improve the benefits of prefetching.
But all this are the usual performance tuning things - without knowing the access plan and typical workload, one cannot be more specific. For example, if this is just one long-running query and you have n other queries that are just fine, changing the database structure may not be the best move...
__________________
Knut Stolze
IBM DB2 Analytics Accelerator
IBM Germany Research & Development
|
|

01-07-10, 03:38
|
|
Registered User
|
|
Join Date: Jun 2008
Location: Chennai
Posts: 21
|
|
Hi Agb,
actually using two LIKE operators in the Where clause is causing full table scan and taking more time for execution.
the query need to be tuned to avoid the full table scan ... and that's why i'm unable to give any idea unless doing some research on indexes for the above query using the query execution plan
and also i have adoubt on using the statement PUB.PUBLSEQ LIKE '%%%%%J'
if you are trying to find all the 6 character strings having the last character as 'J', then the above statement should be written as PUB.PUBLSEQ LIKE '_____J' (replace '%' with '_')
or if you are trying to find all the strings of any length and having the last character as 'J', then the above statement should be written as PUB.PUBLSEQ LIKE '%J'
Please change it accordingly ...
|
|

01-07-10, 03:55
|
|
Registered User
|
|
Join Date: Jun 2008
Location: Chennai
Posts: 21
|
|
sorry agb, i don't know about the usage of wildcard characters for LIKE operator in DB2. I raised the above doubt in the perspective of MS SQL server
|
|

01-07-10, 05:40
|
|
Registered User
|
|
Join Date: Jan 2009
Location: Zoetermeer, Holland
Posts: 555
|
|
Quote:
Originally Posted by kandrusatish
actually using two LIKE operators in the Where clause is causing full table scan and taking more time for execution.
|
Did you pay special attention to the table/columns in your RUNSTATS? Did you use the "like-statistics" option? Might help
|
|

01-07-10, 06:47
|
|
Registered User
|
|
Join Date: Feb 2008
Location: Japan
Posts: 2,193
|
|
Quote:
Hi Friend,
Still it is taking so much time. is any better way we can optimize.
|
agb,
I already gave you three suggestions.
If they were not effective, please inform me.
|
|
| Thread Tools |
Search this Thread |
|
|
|
| Display Modes |
Linear Mode
|
Posting Rules
|
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts
HTML code is Off
|
|
|
|
|