If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Database Server Software > DB2 > Help with long running UPDATE statement

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 11-28-11, 20:16
chippib chippib is offline
Registered User
 
Join Date: Nov 2011
Posts: 21
Smile Help with long running UPDATE statement

Hi, I'm hoping to get some advice on how best to optimize an update statement in DB2 LUW 9.5

We have a table that holds approximately 1.5 billion rows. An ETL process I've just started to look at is broken up into 3 steps:

Step 1 inserts rows into a table. This table has a surrogate primary key (rec_seq_nbr) and a composite index consisting of 8 columns. These columns are the Natural business key. When the rows are inserted into the table a column (curr_ind) takes on a value of 'X'. This steps itself run relatively quick.

Step 2 is where I require some advice. It takes 11+ hours to complete. This step consists an Update statement that looks like this:

update t1 set curr_ind = '0'
where rec_seq_nbr in
(select b.rec_seq_nbr
from (select rec_seq_nbr, col1,col2,col3,col4,col5,col6,col7,col8 from T1
where curr_ind = 'X') as A, T1 AS B
where a.col1 = b.col1
and a.col2 = b.col2
and a.col3 = b.col3
and a.col4 = b.col4
and a.col5 = b.col5
and a.col6 = b.col6
and a.col7 = b.col7
and a.col8 = b.col8
and b.rec_seq_nbr < a.rec_seq_nbr)

(as you deduced there can be duplicates and they want to set the first duplicate with the smaller REC_SEQ_NBR to '0')

Step 3 takes one more pass and updates all rows where curr_ind = 'X' and sets it to '1'.

My question, step 2's Explain plan is ugly as at least 3 tablescans are performed. I've thought about a 'GROUP BY' with a MIN function but the 1.5 billion rows frightens me. Also, why couldn't I perform steps 2 & 3 in one sql job?

Additional info, CURR_IND doesn't belong to a Index. As this column can have 4 states I was pondering about making it a BITMAP index.

Greatly appreciate any help.....
Reply With Quote
  #2 (permalink)  
Old 11-28-11, 21:20
stiruvee stiruvee is offline
Registered User
 
Join Date: May 2010
Location: India
Posts: 34
If you can modify ETL process, it might be better to update existing records to '0' using the composite key as step 1 and insert new records with value '1' as step 2.

If you can not change ETL process, see whether following modification to sql helps your process.

update t1 B set curr_ind = '0'
where exists
(select 1
from T1 as A
where a.col1 = b.col1
and a.col2 = b.col2
and a.col3 = b.col3
and a.col4 = b.col4
and a.col5 = b.col5
and a.col6 = b.col6
and a.col7 = b.col7
and a.col8 = b.col8
and a.curr_ind = 'X'
and b.rec_seq_nbr < a.rec_seq_nbr)
and b.curr_ind = '1'


....Satya
Reply With Quote
  #3 (permalink)  
Old 11-28-11, 21:36
chippib chippib is offline
Registered User
 
Join Date: Nov 2011
Posts: 21
Help with long running UPDATE statment

Thanks Satya, the Exists makes sense, but I'm not sure what the last statement "and b.curr_ind = '1'" is for?
Reply With Quote
  #4 (permalink)  
Old 11-28-11, 21:53
stiruvee stiruvee is offline
Registered User
 
Join Date: May 2010
Location: India
Posts: 34
From your first post, it looks like for same composite key, you might have one record with curr_ind '1' and other older records with curr_ind '0'. By adding the condition for curr_ind ='1', you will update only one record.


Satya...
Reply With Quote
  #5 (permalink)  
Old 11-28-11, 22:01
chippib chippib is offline
Registered User
 
Join Date: Nov 2011
Posts: 21
Help with long running Update statement

Hi again, after the first pass I could possible have duplicate rows with the same natural key (composite index) but the PK will be different. I want the PK with the lower value (but same business key) to take on a value of '0' for CURR_IND. This implies it's not current. The next pass through will set CURR_IND to '1' (most current) for all remaining 'X'.
Reply With Quote
  #6 (permalink)  
Old 11-28-11, 22:24
stiruvee stiruvee is offline
Registered User
 
Join Date: May 2010
Location: India
Posts: 34
Ok, In that case, you might want to change it to b.curr_ind <> '0' to avoid updating older records which are already set '0'.

update t1 B set curr_ind = '0'
where exists
(select 1
from T1 as A
where a.col1 = b.col1
and a.col2 = b.col2
and a.col3 = b.col3
and a.col4 = b.col4
and a.col5 = b.col5
and a.col6 = b.col6
and a.col7 = b.col7
and a.col8 = b.col8
and a.curr_ind = 'X'
and b.rec_seq_nbr < a.rec_seq_nbr)
and b.curr_ind <> '0'
Reply With Quote
  #7 (permalink)  
Old 11-29-11, 02:47
fengsun2 fengsun2 is offline
Registered User
 
Join Date: Nov 2011
Posts: 124
Please post the access plan for sql of step 2 here(db2exfmt output)。
thx
Reply With Quote
  #8 (permalink)  
Old 11-29-11, 04:48
tonkuma tonkuma is online now
Registered User
 
Join Date: Feb 2008
Location: Japan
Posts: 2,193
1) Step 2 and step 3 may be conbined into a MERGE statement.

2) I thought the following indexes might be usefull.
(col1 , col2 , col3 , col4 , col5 , col6 , col7 , col8 , rec_seq_nbr DESC, curr_ind)
or
(curr_ind , col1 , col2 , col3 , col4 , col5 , col6 , col7 , col8 , rec_seq_nbr DESC)

But, I didn't check performance by those ideas.


Example 1: MERGE statement for Step 2 and Step 3.
Code:
MERGE INTO
(SELECT curr_ind
      , ROW_NUMBER()
          OVER(PARTITION BY col1 , col2 , col3 , col4
                          , col5 , col6 , col7 , col8
                   ORDER BY rec_seq_nbr DESC
              ) AS rnum
  FROM  t1
  WHERE curr_ind = 'X'
) t1
USING (VALUES 0) dummy(d)
  ON  d = d
WHEN MATCHED
 AND rnum = 1 THEN
UPDATE
   SET curr_ind = '1'
WHEN MATCHED  THEN
UPDATE
   SET curr_ind = '0'
;

Last edited by tonkuma; 11-29-11 at 05:41. Reason: Change sample code made by my misunderstanding of requirements.
Reply With Quote
  #9 (permalink)  
Old 11-29-11, 05:00
tonkuma tonkuma is online now
Registered User
 
Join Date: Feb 2008
Location: Japan
Posts: 2,193
The UPDATE statement may be better than MERGE in Example 1.

Example 2:
Code:
UPDATE
(SELECT *
  FROM  (SELECT curr_ind
              , ROW_NUMBER()
                   OVER(PARTITION BY col1 , col2 , col3 , col4
                                   , col5 , col6 , col7 , col8
                            ORDER BY rec_seq_nbr DESC
                       ) AS rnum
          FROM  t1
        )
  WHERE curr_ind = 'X'
)
SET curr_ind
  = CASE rnum
    WHEN 1 THEN '1'
    ELSE        '0'
    END
;

Last edited by tonkuma; 11-29-11 at 05:41. Reason: Change sample code made by my misunderstanding of requirements.
Reply With Quote
  #10 (permalink)  
Old 11-29-11, 05:32
tonkuma tonkuma is online now
Registered User
 
Join Date: Feb 2008
Location: Japan
Posts: 2,193
Removed.
Perhaps, my poor English capability made me misunderstanding of requirements.

Last edited by tonkuma; 11-29-11 at 05:38.
Reply With Quote
  #11 (permalink)  
Old 11-29-11, 08:35
chippib chippib is offline
Registered User
 
Join Date: Nov 2011
Posts: 21
Help with long running UPDATE statement

Actually your 'English' is perfect. I will test out both suggestions. thanks
Reply With Quote
  #12 (permalink)  
Old 11-29-11, 10:03
tonkuma tonkuma is online now
Registered User
 
Join Date: Feb 2008
Location: Japan
Posts: 2,193
This seems better performance with index
(col1 , col2 , col3 , col4 , col5 , col6 , col7 , col8 , curr_ind , rec_seq_nbr DESC)

Example 3: for Step 2 and Step 3.
Code:
UPDATE t1 t
   SET curr_ind
     = COALESCE(
          (SELECT LEFT( '0' , SIGN(MAX(s.rec_seq_nbr)) )
            FROM  t1 s
            WHERE s.col1        = t.col1
              AND s.col2        = t.col2
              AND s.col3        = t.col3
              AND s.col4        = t.col4
              AND s.col5        = t.col5
              AND s.col6        = t.col6
              AND s.col7        = t.col7
              AND s.col8        = t.col8
              AND s.rec_seq_nbr > t.rec_seq_nbr
              AND s.curr_ind    = 'X'
          )
        , '1'
       )
 WHERE curr_ind    = 'X'
;

Last edited by tonkuma; 11-29-11 at 10:09. Reason: Revise code(remove qualifier of t1, add conditions for co5 to col8)
Reply With Quote
  #13 (permalink)  
Old 11-29-11, 12:17
chippib chippib is offline
Registered User
 
Join Date: Nov 2011
Posts: 21
Help with long running UPDATE statement

hi, not sure if an aggreagate fuction is allowed in the SET clause of an UPDATE statement

Getting the following message:

SQL0120N
Invalid use of an aggregate function or OLAP function.

Explanation
An aggregate function or OLAP function can only be used in the select list of a fullselect, the having clause, or, with restrictions, in a WHERE clause or GROUP BY clause.

A WHERE clause can contain an aggregate function or OLAP function only if that clause appears within a subquery of a HAVING clause and the argument of the function is a correlated reference to a group.

A GROUP BY clause can contain an aggregate function or OLAP function only if the argument of the function is a correlated reference to a column in a different subselect than the one containing the GROUP BY clause.

An OLAP function cannot be used within the argument list of an XMLQUERY or XMLEXISTS expression.

The statement cannot be processed.
Reply With Quote
  #14 (permalink)  
Old 11-29-11, 14:30
chippib chippib is offline
Registered User
 
Join Date: Nov 2011
Posts: 21
Help with long running UPDATE statement

My last post was incorrect, my aplogies. I get the following error when I attempt to Explain the SQL

No authorized routine named 'LEFT' of type 'Function' having compaitble arguments was found. SQLcode = -440, SQLSTATE = 42884
Reply With Quote
  #15 (permalink)  
Old 11-29-11, 19:50
tonkuma tonkuma is online now
Registered User
 
Join Date: Feb 2008
Location: Japan
Posts: 2,193
It was successful on my DB2 Express-C 9.7.5 for Windows,
by using small test data(and simplified DDL).

Did you used exactly same expression?
(SELECT LEFT( '0' , SIGN(MAX(s.rec_seq_nbr)) )

If error was still returned, please try
(SELECT MAX('0')
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On