| |
|
If this is your first visit, be sure to check out the FAQ by clicking the link above.
You may have to register before you can post: click the register link above to proceed.
To start viewing messages, select the forum that you want to visit from the selection below.
|
 |
|

11-28-11, 20:16
|
|
Registered User
|
|
Join Date: Nov 2011
Posts: 21
|
|
Help with long running UPDATE statement
|
|
Hi, I'm hoping to get some advice on how best to optimize an update statement in DB2 LUW 9.5
We have a table that holds approximately 1.5 billion rows. An ETL process I've just started to look at is broken up into 3 steps:
Step 1 inserts rows into a table. This table has a surrogate primary key (rec_seq_nbr) and a composite index consisting of 8 columns. These columns are the Natural business key. When the rows are inserted into the table a column (curr_ind) takes on a value of 'X'. This steps itself run relatively quick.
Step 2 is where I require some advice. It takes 11+ hours to complete. This step consists an Update statement that looks like this:
update t1 set curr_ind = '0'
where rec_seq_nbr in
(select b.rec_seq_nbr
from (select rec_seq_nbr, col1,col2,col3,col4,col5,col6,col7,col8 from T1
where curr_ind = 'X') as A, T1 AS B
where a.col1 = b.col1
and a.col2 = b.col2
and a.col3 = b.col3
and a.col4 = b.col4
and a.col5 = b.col5
and a.col6 = b.col6
and a.col7 = b.col7
and a.col8 = b.col8
and b.rec_seq_nbr < a.rec_seq_nbr)
(as you deduced there can be duplicates and they want to set the first duplicate with the smaller REC_SEQ_NBR to '0')
Step 3 takes one more pass and updates all rows where curr_ind = 'X' and sets it to '1'.
My question, step 2's Explain plan is ugly as at least 3 tablescans are performed. I've thought about a 'GROUP BY' with a MIN function but the 1.5 billion rows frightens me. Also, why couldn't I perform steps 2 & 3 in one sql job?
Additional info, CURR_IND doesn't belong to a Index. As this column can have 4 states I was pondering about making it a BITMAP index.
Greatly appreciate any help.....
|
|

11-28-11, 21:20
|
|
Registered User
|
|
Join Date: May 2010
Location: India
Posts: 34
|
|
If you can modify ETL process, it might be better to update existing records to '0' using the composite key as step 1 and insert new records with value '1' as step 2.
If you can not change ETL process, see whether following modification to sql helps your process.
update t1 B set curr_ind = '0'
where exists
(select 1
from T1 as A
where a.col1 = b.col1
and a.col2 = b.col2
and a.col3 = b.col3
and a.col4 = b.col4
and a.col5 = b.col5
and a.col6 = b.col6
and a.col7 = b.col7
and a.col8 = b.col8
and a.curr_ind = 'X'
and b.rec_seq_nbr < a.rec_seq_nbr)
and b.curr_ind = '1'
....Satya
|
|

11-28-11, 21:36
|
|
Registered User
|
|
Join Date: Nov 2011
Posts: 21
|
|
|
Help with long running UPDATE statment
|
|
Thanks Satya, the Exists makes sense, but I'm not sure what the last statement "and b.curr_ind = '1'" is for?
|
|

11-28-11, 21:53
|
|
Registered User
|
|
Join Date: May 2010
Location: India
Posts: 34
|
|
From your first post, it looks like for same composite key, you might have one record with curr_ind '1' and other older records with curr_ind '0'. By adding the condition for curr_ind ='1', you will update only one record.
Satya...
|
|

11-28-11, 22:01
|
|
Registered User
|
|
Join Date: Nov 2011
Posts: 21
|
|
|
Help with long running Update statement
Hi again, after the first pass I could possible have duplicate rows with the same natural key (composite index) but the PK will be different. I want the PK with the lower value (but same business key) to take on a value of '0' for CURR_IND. This implies it's not current. The next pass through will set CURR_IND to '1' (most current) for all remaining 'X'.
|
|

11-28-11, 22:24
|
|
Registered User
|
|
Join Date: May 2010
Location: India
Posts: 34
|
|
Ok, In that case, you might want to change it to b.curr_ind <> '0' to avoid updating older records which are already set '0'.
update t1 B set curr_ind = '0'
where exists
(select 1
from T1 as A
where a.col1 = b.col1
and a.col2 = b.col2
and a.col3 = b.col3
and a.col4 = b.col4
and a.col5 = b.col5
and a.col6 = b.col6
and a.col7 = b.col7
and a.col8 = b.col8
and a.curr_ind = 'X'
and b.rec_seq_nbr < a.rec_seq_nbr)
and b.curr_ind <> '0'
|
|

11-29-11, 02:47
|
|
Registered User
|
|
Join Date: Nov 2011
Posts: 124
|
|
Please post the access plan for sql of step 2 here(db2exfmt output)。
thx
|
|

11-29-11, 04:48
|
|
Registered User
|
|
Join Date: Feb 2008
Location: Japan
Posts: 2,193
|
|
1) Step 2 and step 3 may be conbined into a MERGE statement.
2) I thought the following indexes might be usefull.
(col1 , col2 , col3 , col4 , col5 , col6 , col7 , col8 , rec_seq_nbr DESC, curr_ind)
or
(curr_ind , col1 , col2 , col3 , col4 , col5 , col6 , col7 , col8 , rec_seq_nbr DESC)
But, I didn't check performance by those ideas.
Example 1: MERGE statement for Step 2 and Step 3.
Code:
MERGE INTO
(SELECT curr_ind
, ROW_NUMBER()
OVER(PARTITION BY col1 , col2 , col3 , col4
, col5 , col6 , col7 , col8
ORDER BY rec_seq_nbr DESC
) AS rnum
FROM t1
WHERE curr_ind = 'X'
) t1
USING (VALUES 0) dummy(d)
ON d = d
WHEN MATCHED
AND rnum = 1 THEN
UPDATE
SET curr_ind = '1'
WHEN MATCHED THEN
UPDATE
SET curr_ind = '0'
;
|
Last edited by tonkuma; 11-29-11 at 05:41.
Reason: Change sample code made by my misunderstanding of requirements.
|

11-29-11, 05:00
|
|
Registered User
|
|
Join Date: Feb 2008
Location: Japan
Posts: 2,193
|
|
The UPDATE statement may be better than MERGE in Example 1.
Example 2:
Code:
UPDATE
(SELECT *
FROM (SELECT curr_ind
, ROW_NUMBER()
OVER(PARTITION BY col1 , col2 , col3 , col4
, col5 , col6 , col7 , col8
ORDER BY rec_seq_nbr DESC
) AS rnum
FROM t1
)
WHERE curr_ind = 'X'
)
SET curr_ind
= CASE rnum
WHEN 1 THEN '1'
ELSE '0'
END
;
|
Last edited by tonkuma; 11-29-11 at 05:41.
Reason: Change sample code made by my misunderstanding of requirements.
|

11-29-11, 05:32
|
|
Registered User
|
|
Join Date: Feb 2008
Location: Japan
Posts: 2,193
|
|
Removed.
Perhaps, my poor English capability made me misunderstanding of requirements.
|
Last edited by tonkuma; 11-29-11 at 05:38.
|

11-29-11, 08:35
|
|
Registered User
|
|
Join Date: Nov 2011
Posts: 21
|
|
|
Help with long running UPDATE statement
Actually your 'English' is perfect. I will test out both suggestions. thanks
|
|

11-29-11, 10:03
|
|
Registered User
|
|
Join Date: Feb 2008
Location: Japan
Posts: 2,193
|
|
This seems better performance with index
(col1 , col2 , col3 , col4 , col5 , col6 , col7 , col8 , curr_ind , rec_seq_nbr DESC)
Example 3: for Step 2 and Step 3.
Code:
UPDATE t1 t
SET curr_ind
= COALESCE(
(SELECT LEFT( '0' , SIGN(MAX(s.rec_seq_nbr)) )
FROM t1 s
WHERE s.col1 = t.col1
AND s.col2 = t.col2
AND s.col3 = t.col3
AND s.col4 = t.col4
AND s.col5 = t.col5
AND s.col6 = t.col6
AND s.col7 = t.col7
AND s.col8 = t.col8
AND s.rec_seq_nbr > t.rec_seq_nbr
AND s.curr_ind = 'X'
)
, '1'
)
WHERE curr_ind = 'X'
;
|
Last edited by tonkuma; 11-29-11 at 10:09.
Reason: Revise code(remove qualifier of t1, add conditions for co5 to col8)
|

11-29-11, 12:17
|
|
Registered User
|
|
Join Date: Nov 2011
Posts: 21
|
|
|
Help with long running UPDATE statement
hi, not sure if an aggreagate fuction is allowed in the SET clause of an UPDATE statement
Getting the following message:
SQL0120N
Invalid use of an aggregate function or OLAP function.
Explanation
An aggregate function or OLAP function can only be used in the select list of a fullselect, the having clause, or, with restrictions, in a WHERE clause or GROUP BY clause.
A WHERE clause can contain an aggregate function or OLAP function only if that clause appears within a subquery of a HAVING clause and the argument of the function is a correlated reference to a group.
A GROUP BY clause can contain an aggregate function or OLAP function only if the argument of the function is a correlated reference to a column in a different subselect than the one containing the GROUP BY clause.
An OLAP function cannot be used within the argument list of an XMLQUERY or XMLEXISTS expression.
The statement cannot be processed.
|
|

11-29-11, 14:30
|
|
Registered User
|
|
Join Date: Nov 2011
Posts: 21
|
|
|
Help with long running UPDATE statement
My last post was incorrect, my aplogies. I get the following error when I attempt to Explain the SQL
No authorized routine named 'LEFT' of type 'Function' having compaitble arguments was found. SQLcode = -440, SQLSTATE = 42884
|
|

11-29-11, 19:50
|
|
Registered User
|
|
Join Date: Feb 2008
Location: Japan
Posts: 2,193
|
|
It was successful on my DB2 Express-C 9.7.5 for Windows,
by using small test data(and simplified DDL).
Did you used exactly same expression?
(SELECT LEFT( '0' , SIGN(MAX(s.rec_seq_nbr)) )
If error was still returned, please try
(SELECT MAX('0')
|
|
| Thread Tools |
Search this Thread |
|
|
|
| Display Modes |
Linear Mode
|
Posting Rules
|
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts
HTML code is Off
|
|
|
|
|