If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Database Server Software > DB2 > Best way to delete a huge number of rows?

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 10-29-09, 17:43
db2user24 db2user24 is offline
Registered User
 
Join Date: Nov 2007
Posts: 248
Best way to delete a huge number of rows?

Hi,

What's the best way to delete a substantial number of rows from a table? This particular table is HUGE so I can't export, drop and import the necessary data like I have done for other tables in our db. Is the ideal way to just create a stored procedure with startdate, enddate parameters.. pass these params to the delete command and then COMMIT;? Or would you recommend some other method? Thanks!!
Reply With Quote
  #2 (permalink)  
Old 10-29-09, 17:45
db2girl db2girl is offline
∞∞∞∞∞∞
 
Join Date: Aug 2008
Location: Toronto, Canada
Posts: 1,816
Can you export what you want to keep and then load/replace? Load builds pages and writes them to the table so it's fast plus very minimal logging
Reply With Quote
  #3 (permalink)  
Old 10-29-09, 17:48
db2girl db2girl is offline
∞∞∞∞∞∞
 
Join Date: Aug 2008
Location: Toronto, Canada
Posts: 1,816
Reply With Quote
  #4 (permalink)  
Old 10-29-09, 17:55
db2user24 db2user24 is offline
Registered User
 
Join Date: Nov 2007
Posts: 248
Quote:
Originally Posted by db2girl View Post
Can you export what you want to keep and then load/replace? Load builds pages and writes them to the table so it's fast plus very minimal logging
Unfortunately, I cannot since this table is humongous.. deleting rows is the only way to go..
Reply With Quote
  #5 (permalink)  
Old 10-29-09, 18:01
n_i n_i is offline
:-)
 
Join Date: Jun 2003
Location: Toronto, Canada
Posts: 4,449
Quote:
Originally Posted by db2user24 View Post
Unfortunately, I cannot since this table is humongous.. deleting rows is the only way to go..
I don't see how this is related to the table size (by the way, "humongous" means different things to different people...) Would you rather run your delete SP for a couple of months?
Reply With Quote
  #6 (permalink)  
Old 10-29-09, 18:28
Marcus_A Marcus_A is offline
Registered User
 
Join Date: May 2003
Location: USA
Posts: 5,196
This has been discussed several times.

I have written stored procedures to do deletes on very large tables. The stored procedure has input parameters to specify how many minutes the SP would run, or a certain number of rows to be deleted, before the SP ended so that I could run it during non-prime-time hours on a scheduled basis for as many nights as it took to delete all the required rows (based on another input criteria such as date range).

The stored procedure opens a cursor based on the criteria to be deleted. The Cursor is defined using WITH HOLD option, so the cursor is not closed when a commit is issued. The cursor is also defined WITH UR to minimize an lock contention.

The SP fetches rows from the cursor and if the row is to be deleted, then the SP issues a separate delete based on the PK. Do not use "delete where current of cursor".

After a certain number of deletes (as specified by another input parameter) the SP does a commit. This releases locks on any rows that have been deleted. I usually do somewhere between 100 and 1000 deletes in a single commit interval. Committing less often will make the SP only slightly faster and could degrade concurrency. Even if concurrency is not a big issue, I would not go higher than 5000 deletes in a commit interval. Make sure you have one final commit at the end of your SP which may be necessary depending on your SP logic.

Also make sure you are doing archive log maintenance on a regular basis so that the archive log path does not fill up.

I would set the following environment variable to increase concurrency:

db2set DB2_SKIPDELETED=YES (you will need to restart instance)
__________________
M. A. Feldman
IBM Certified DBA on DB2 for Linux, UNIX, and Windows
IBM Certified DBA on DB2 for z/OS and OS/390
Reply With Quote
  #7 (permalink)  
Old 10-29-09, 18:48
stolze stolze is offline
Registered User
 
Join Date: Jan 2007
Location: Jena, Germany
Posts: 2,662
If you plan to take a backup after the DELETE operation, you could also turn on NOT LOGGED INITIALLY for the table and cut short on the logging overhead.
__________________
Knut Stolze
IBM DB2 Analytics Accelerator
IBM Germany Research & Development
Reply With Quote
  #8 (permalink)  
Old 10-29-09, 18:52
Marcus_A Marcus_A is offline
Registered User
 
Join Date: May 2003
Location: USA
Posts: 5,196
Quote:
Originally Posted by stolze View Post
If you plan to take a backup after the DELETE operation, you could also turn on NOT LOGGED INITIALLY for the table and cut short on the logging overhead.
Given that the table in question is apparently extremely large, make sure you update your resume before you try this.
__________________
M. A. Feldman
IBM Certified DBA on DB2 for Linux, UNIX, and Windows
IBM Certified DBA on DB2 for z/OS and OS/390
Reply With Quote
  #9 (permalink)  
Old 10-29-09, 19:44
stolze stolze is offline
Registered User
 
Join Date: Jan 2007
Location: Jena, Germany
Posts: 2,662
Quote:
Originally Posted by Marcus_A View Post
Given that the table in question is apparently extremely large, make sure you update your resume before you try this.
Probably you are right. If something goes wrong during the delete, you're in deep trouble restoring the data again.
__________________
Knut Stolze
IBM DB2 Analytics Accelerator
IBM Germany Research & Development
Reply With Quote
  #10 (permalink)  
Old 10-30-09, 04:52
dr_te_z dr_te_z is offline
Registered User
 
Join Date: Jan 2009
Location: Zoetermeer, Holland
Posts: 555
And this option
Code:
ALTER TABLE EMPLOYEE LOCKSIZE TABLE
is that not relevant for this issue. Also activate "infinite loggin" (logsecond := -1) and go-and-delete.
Just hope that you do not have to rollback after a couple of hours
Reply With Quote
  #11 (permalink)  
Old 10-30-09, 11:11
Marcus_A Marcus_A is offline
Registered User
 
Join Date: May 2003
Location: USA
Posts: 5,196
Quote:
Originally Posted by dr_te_z View Post
And this option
Code:
ALTER TABLE EMPLOYEE LOCKSIZE TABLE
is that not relevant for this issue. Also activate "infinite loggin" (logsecond := -1) and go-and-delete.
Just hope that you do not have to rollback after a couple of hours
You would only want to lock the table if no one else is using it. Besides, DB2 will escalate to table lock automatically if it runs out of memory for row locks.

It seems to me that infinite logging only works if you have enough disk space to hold all the logs for a single transaction. That seems unlikely in the case of the OP. Maybe I am missing something? Also, if this is an HADR database, infinite logging is not allowed.
__________________
M. A. Feldman
IBM Certified DBA on DB2 for Linux, UNIX, and Windows
IBM Certified DBA on DB2 for z/OS and OS/390
Reply With Quote
  #12 (permalink)  
Old 10-30-09, 17:12
dr_te_z dr_te_z is offline
Registered User
 
Join Date: Jan 2009
Location: Zoetermeer, Holland
Posts: 555
Quote:
Originally Posted by Marcus_A View Post
DB2 will escalate to table lock automatically if it runs out of memory for row locks.
That's true, you know it is going to happen, so why not set it yourself and save all the initial locking overhead?

Quote:
Originally Posted by Marcus_A View Post
infinite logging only works if you have enough disk space to hold all the logs for a single transaction.
Also true but the alternative is calculating the right number of secondary logs. When that is not big enough, you will have room on disk but the transaction will fail afterall.When you set that big enough, the existing problem of having enough disk to hold all those logs is still valid, so... I like infinite logging .
Reply With Quote
  #13 (permalink)  
Old 10-30-09, 17:26
Marcus_A Marcus_A is offline
Registered User
 
Join Date: May 2003
Location: USA
Posts: 5,196
Quote:
Originally Posted by dr_te_z View Post
That's true, you know it is going to happen, so why not set it yourself and save all the initial locking overhead?

Also true but the alternative is calculating the right number of secondary logs. When that is not big enough, you will have room on disk but the transaction will fail afterall.When you set that big enough, the existing problem of having enough disk to hold all those logs is still valid, so... I like infinite logging .
None of those is the proper solution IMO. The best way is to write a SP as I described above.

You are assuming that no one else needs to use the database while the deletes are being done, and I was not assuming that. I don't know which is the actual situation for the OP.
__________________
M. A. Feldman
IBM Certified DBA on DB2 for Linux, UNIX, and Windows
IBM Certified DBA on DB2 for z/OS and OS/390
Reply With Quote
  #14 (permalink)  
Old 11-02-09, 15:06
db2user24 db2user24 is offline
Registered User
 
Join Date: Nov 2007
Posts: 248
Thanks Marcus, you are right in your assumption...I would prefer to have something running during off peak hours so it doesn't affect the performance of our app during regular business hours..sproc sounds like it's the way to go..
Reply With Quote
  #15 (permalink)  
Old 11-02-09, 16:39
sathyaram_s sathyaram_s is offline
Super Moderator
 
Join Date: Aug 2001
Location: UK
Posts: 4,534
An alternative to the cursor approach suggested by Marcus, you can also use

Code:
delete from table2 where pkcol in (select pkcol from table2 fetch first 1000 rows only where time between t2 and t2)
__________________
Visit the new-look IDUG Website , register to gain access to the excellent content.
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On