  1. #1
    Join Date
    Mar 2003
    Posts
    69

    db2 delete -- Interesting Question

    Hi All,
    I have a very interesting question. I have a table with around 50 million records, and about 30 million of them can be deleted. My DELETE against that table always fails with a timeout and cannot delete any of the records. So I wrote a Visual Basic program that runs at night and deletes 10 thousand records per shot; with that I can delete about 50-80 thousand records in the whole night (I cannot run it during the day in production). But the next day another 40 to 60 thousand records get added. So what is the best way to go about this deletion?

    - Thanks,
    Anto

  2. #2
    Join Date
    Jun 2003
    Location
    Toronto, Canada
    Posts
    5,452

    Re: db2 delete -- Interesting Question

    Originally posted by antodomnic
    I have a table with around 50 million records, and about 30 million of them can be deleted. My DELETE against that table always fails with a timeout and cannot delete any of the records.
    Some options that come to mind:

    - define the table as NOT LOGGED INITIALLY. Deactivate logging before you start deleting records;
    - export the records you would like to keep; then LOAD ... REPLACE them back into the table;
    - drop indexes on the table before deleting records; re-create indexes when you're done;

    Also, if you have any ON DELETE CASCADE constraints on that table, removing them will greatly improve performance.
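    For what it's worth, a minimal CLP sketch of the first two options, assuming a made-up table MYSCHEMA.BIG_TABLE, a made-up purge predicate on CREATED_DT, and that the table was created with the NOT LOGGED INITIALLY attribute (if the not-logged unit of work fails, the table can become unusable, so have a backup first):

    -- Option 1: not-logged delete, all in one unit of work (run with autocommit off, e.g. db2 +c)
    ALTER TABLE myschema.big_table ACTIVATE NOT LOGGED INITIALLY;
    DELETE FROM myschema.big_table WHERE created_dt < CURRENT DATE - 1 YEAR;   -- assumed purge predicate
    COMMIT;

    -- Option 2: export the rows you want to keep, then LOAD ... REPLACE them back
    EXPORT TO keep.ixf OF IXF SELECT * FROM myschema.big_table WHERE created_dt >= CURRENT DATE - 1 YEAR;
    LOAD FROM keep.ixf OF IXF REPLACE INTO myschema.big_table NONRECOVERABLE;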

  3. #3
    Join Date
    Aug 2001
    Location
    UK
    Posts
    4,649

    Re: db2 delete -- Interesting Question

    IMHO, n_i's second option is the best for your scenario ... If you are deleting 60% of a 50 million row table, there will be a huge number of empty or partially filled pages, which will be a big hit on performance ...

    It depends on the referential constraints you have on the table ... If you don't have any, my preference would be to create another table with a similar structure, export the 20 million records you want to keep, and load them into the new table; then test the data ... When you are happy, drop the original table and rename the new table .... You might need to re-create aliases, views, triggers, etc. ... This is easy to accomplish, and post-change performance will also be good ....
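    Roughly, with placeholder names and a made-up keep predicate, the steps would look like this ... adjust for your own table and remember the dependent objects ...

    -- build the new table and fill it with the rows you want to keep
    CREATE TABLE myschema.big_table_new LIKE myschema.big_table;
    EXPORT TO keep.ixf OF IXF SELECT * FROM myschema.big_table WHERE created_dt >= CURRENT DATE - 1 YEAR;
    LOAD FROM keep.ixf OF IXF REPLACE INTO myschema.big_table_new NONRECOVERABLE;

    -- after testing the data, swap the names (both tables must be in the same schema)
    RENAME TABLE myschema.big_table TO big_table_old;
    RENAME TABLE myschema.big_table_new TO big_table;
    -- then re-create indexes, PK, aliases, views, triggers, grants etc. and drop the old table when happy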

    BTW, 50-80K records in a night (how many hours?) seems to be a small number ...

    Alternatively, you can write a stored procedure to delete the records, committing, maybe, every 10,000 records ... Here's a sample:
    http://dbforums.com/showthread.php?t...17#post1715617
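    I can't reproduce that sample here, but a chunked delete in an SQL procedure could look something like this (procedure, table and column names are invented; the WITH HOLD cursor stays open across the intermediate COMMITs) ...

    CREATE PROCEDURE myschema.purge_big_table ()
    LANGUAGE SQL
    BEGIN
       DECLARE SQLCODE INTEGER DEFAULT 0;
       DECLARE v_dt DATE;
       DECLARE v_deleted INTEGER DEFAULT 0;
       -- WITH HOLD keeps the cursor positioned across the intermediate COMMITs
       DECLARE c1 CURSOR WITH HOLD FOR
          SELECT created_dt
            FROM myschema.big_table
           WHERE created_dt < CURRENT DATE - 1 YEAR   -- assumed purge predicate
          FOR UPDATE;

       OPEN c1;
       del_loop:
       LOOP
          FETCH c1 INTO v_dt;
          IF SQLCODE = 100 THEN      -- no more rows to delete
             LEAVE del_loop;
          END IF;
          DELETE FROM myschema.big_table WHERE CURRENT OF c1;
          SET v_deleted = v_deleted + 1;
          IF MOD(v_deleted, 10000) = 0 THEN
             COMMIT;                 -- free log space and locks every 10,000 rows
          END IF;
       END LOOP del_loop;
       CLOSE c1;
       COMMIT;
    END

    Deploy it with an alternate statement terminator (e.g. db2 -td@ -f purge.sql) and call it off-hours with db2 "CALL myschema.purge_big_table()" ...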

    Cheers

    Sathyaram




  4. #4
    Join Date
    Mar 2003
    Posts
    343
    Alternatively, create another table with the same partitioning key, in the same schema, but without any constraints or indexes (including the PK), using NOT LOGGED INITIALLY. In your session, activate NOT LOGGED INITIALLY and insert into the new table the rows you want to keep. Since this is a collocated, not-logged, non-constrained insert it should be very fast. Then rename the current table to, say, _old and the _new table to the current name, and follow Sathyaram's other instructions. For the renames, ensure that the tables are in the same schema.

    We have done this and moved hundreds of millions of rows in 2-5 hours. It also effectively reorgs the table, you get to keep the original for verification, and you do not have to wait for an export.
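    As a rough sketch, with invented names and a made-up keep predicate (the ALTER and the INSERT must be in the same unit of work, so run with autocommit off; confirm the partitioning key on the new table if you are on multiple partitions):

    -- new table: same structure, no indexes or constraints, created NOT LOGGED INITIALLY
    CREATE TABLE myschema.big_table_new LIKE myschema.big_table NOT LOGGED INITIALLY;

    -- one unit of work: activate NOT LOGGED INITIALLY and copy over the rows to keep
    ALTER TABLE myschema.big_table_new ACTIVATE NOT LOGGED INITIALLY;
    INSERT INTO myschema.big_table_new
       SELECT * FROM myschema.big_table
       WHERE created_dt >= CURRENT DATE - 1 YEAR;   -- assumed keep predicate
    COMMIT;

    -- swap the names (both tables in the same schema)
    RENAME TABLE myschema.big_table TO big_table_old;
    RENAME TABLE myschema.big_table_new TO big_table;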

    Hope this helps

  5. #5
    Join Date
    Jun 2003
    Location
    Canada
    Posts
    37

    Re: db2 delete -- Interesting Question

    If you want to avoid having to run a separate RUNSTATS, collect some inline stats from LOAD/REORG. This tends to be faster than running RUNSTATS as an add-on job.
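    For example, with the LOAD syntax of that era it would be something along these lines (file and table names invented; note that inline statistics are only collected when you use the REPLACE option):

    LOAD FROM keep.ixf OF IXF
       REPLACE INTO myschema.big_table
       STATISTICS YES WITH DISTRIBUTION AND DETAILED INDEXES ALL
       NONRECOVERABLE;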

    Cheers,

    Julius

  6. #6
    Join Date
    Aug 2001
    Location
    UK
    Posts
    4,649

    Re: db2 delete -- Interesting Question

    Remember, collecting statistics is possible only if you use the REPLACE option ...

    Also, you can create indexes on the table before loading the data (if you are using LOAD) ... This will perform better than adding the indexes after the LOAD
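    In other words, the comparison is roughly between these two orderings (placeholder names) ...

    -- (a) create the index first and let LOAD build it during the load
    CREATE INDEX myschema.ix_big_dt ON myschema.big_table (created_dt);
    LOAD FROM keep.ixf OF IXF REPLACE INTO myschema.big_table NONRECOVERABLE;

    -- (b) load first, create the index afterwards
    LOAD FROM keep.ixf OF IXF REPLACE INTO myschema.big_table NONRECOVERABLE;
    CREATE INDEX myschema.ix_big_dt ON myschema.big_table (created_dt);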

    Cheers

    Sathyaram



  7. #7
    Join Date
    Mar 2003
    Posts
    343
    If this is a large table, I disagree that indexes should be built before loading the data. It is true that the index is physically created after the load, but what I have noticed is that the sorting required to build the index takes place while the load is in progress, and for a large table with multiple indexes it is a significant overhead.

    One example: we have a table which currently has 3.8 billion rows across 34 logical nodes. In September 2002 it had approximately 2.3 billion rows and we needed to migrate the rows over to a new table to reorg it. I thought it would be good to keep the PK on the new table for the reasons mentioned above. It took extremely long to load (on the order of days). Following some issues in May we had to export and load the whole DW; at that time the table had about 3.4 billion rows. This time we dropped all indexes, including the PK, and the table took between 6 and 8 hours to load. I subsequently learnt that it was the index sort which affects the performance.

    Hope this helps.

  8. #8
    Join Date
    Aug 2001
    Location
    UK
    Posts
    4,649
    I'm quite surprised that the sort for index building 'starts' before the load phase completes ... (though I have never checked whether there is a sort during the LOAD phase). Maybe that is how it behaves for loads of the size you mention .... I have used LOAD for a table with about 15 million records and I observed that using LOAD on a table with indexes was more efficient than LOAD followed by index creation .....

    As you will know, index building in any case (whether in the LOAD or in CREATE INDEX) will involve sorting the data ...

    Cheers

    Sathyaram



  9. #9
    Join Date
    Mar 2003
    Posts
    343
    It sorts while the load takes place - I agree that while loading 15 million rows it may not be a significant overhead (we do that every day), especially if the load is the only process running at the time.
