  1. #1

"MOVE rows procedure" (insert + delete)

    Hi,

We need to implement an SQL stored procedure to archive application data rows from an "active" table to an "archive" table after a certain number of days. The rows in the active table can be somewhat large (they have CLOB fields; row sizes run from 30 to 100 KB), and there are usually about 300,000 - 500,000 new rows per day in the active table.

Because of the number of rows and the data size, the transactions run long and fill the transaction logs quickly, so it has not been possible to move the rows to the archive table in one single huge transaction.

For this we have written a fairly simple SQL procedure, which moves the rows by doing an insert and a delete for 5000 rows in a loop, commits, and then repeats until all rows are "moved" to the archive table.

This SQL procedure has been running fine so far in the development environment, but is this design in fact faulty in how it selects which rows to delete after the inserts (see the sample code)? The insert and the delete run the same query against the same table, but does that query always return the data in the same order between the insert and delete executions? (Are the deleted rows always, in all situations, the same rows that were inserted?)


The more I think about it, the more this current design feels faulty. Do you have any recommendations or tips on better options for implementing this "move procedure"? (Using a temp table?)

Using ORDER BY in the selects would ensure the order (see the sketch after the procedure below), but I wonder how big a performance hit it might cause at larger data volumes...?


    Insert + delete part of the SQL procedure:
    --------------------------------------------
-- prmNumberOfDays is an input INTEGER parameter of the procedure
SET archiveDate = (CURRENT_DATE - prmNumberOfDays DAYS);
SET countOfRowsToMove = (select count(*) from APP_ACTIVE where DATE(CREATED_TIMESTAMP) < archiveDate);

    WHILE (countOfRowsToMove > 0) DO

    -- Insert 5000 rows
    insert into APP_ARCHIVE(APP_TIMESTAMP, WORK_ID, APP_DATA, CREATED_TIMESTAMP) (select APP_TIMESTAMP, WORK_ID, APP_DATA, CREATED_TIMESTAMP from APP_ACTIVE where DATE(CREATED_TIMESTAMP) < archiveDate FETCH FIRST 5000 ROWS ONLY);

-- Delete 5000 rows (are these always the same rows as in the insert step?)
    delete from APP_ACTIVE where ID_APP_ACTIVE in (select ID_APP_ACTIVE from APP_ACTIVE where DATE(CREATED_TIMESTAMP) < archiveDate FETCH FIRST 5000 ROWS ONLY);

    -- Commit change
    COMMIT;

    set countOfRowsToMove = countOfRowsToMove - 5000;

    END WHILE;

    --------------------------------------------
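For reference, the ORDER BY variant I am wondering about would look roughly like this. This is only a sketch: it pins both statements to the same key order, assuming ID_APP_ACTIVE is the primary key and that our DB2 level accepts ORDER BY in these subselects:

    -- Both statements order on the key, so the "first 5000 rows" should match
    insert into APP_ARCHIVE(APP_TIMESTAMP, WORK_ID, APP_DATA, CREATED_TIMESTAMP)
    (select APP_TIMESTAMP, WORK_ID, APP_DATA, CREATED_TIMESTAMP
     from APP_ACTIVE
     where DATE(CREATED_TIMESTAMP) < archiveDate
     order by ID_APP_ACTIVE
     FETCH FIRST 5000 ROWS ONLY);

    delete from APP_ACTIVE where ID_APP_ACTIVE in
    (select ID_APP_ACTIVE
     from APP_ACTIVE
     where DATE(CREATED_TIMESTAMP) < archiveDate
     order by ID_APP_ACTIVE
     FETCH FIRST 5000 ROWS ONLY);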

    Thanks for your help.

  2. #2
How about using EXPORT to export the data to a temporary file, then using LOAD to put the data into your archive table? This will speed up the load process. Add a commit count to the LOAD and let it finish. Then check the return code to make sure it finished OK. After that you can safely delete from the original table. No need for the fetch loop.

This also assures that you have in fact moved all of the data before the deletion takes place. And using a COMMITCOUNT as part of the load will prevent you from filling up the logs.
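Roughly, a sketch of what that could look like, reusing the table and column names from the first post (the file paths and cutoff date are placeholders; strictly speaking COMMITCOUNT belongs to IMPORT, while LOAD takes SAVECOUNT for its consistency points and is minimally logged in any case):

    -- Export the qualifying rows; CLOBs go to separate files via lobsinfile
    EXPORT TO /archive/app_rows.ixf OF IXF
    LOBS TO /archive/lobs/ MODIFIED BY lobsinfile
    SELECT APP_TIMESTAMP, WORK_ID, APP_DATA, CREATED_TIMESTAMP
    FROM APP_ACTIVE WHERE DATE(CREATED_TIMESTAMP) < '2006-05-01';

    -- Append the exported rows; SAVECOUNT sets periodic consistency points
    LOAD FROM /archive/app_rows.ixf OF IXF
    LOBS FROM /archive/lobs/ MODIFIED BY lobsinfile
    SAVECOUNT 5000 MESSAGES /archive/load.msg
    INSERT INTO APP_ARCHIVE;

    -- Only after checking the LOAD return code / message file:
    DELETE FROM APP_ACTIVE WHERE DATE(CREATED_TIMESTAMP) < '2006-05-01';
    COMMIT;

The final DELETE still needs log space, of course, so it may itself have to be batched like the original loop.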
    --
    IBM Certified DBA on DB2 for Linux, UNIX, and Windows

    DB2 v9.7.0.6 os 6.1.0.0

  3. #3
Yes, I would say that your logic is faulty. There is no way to ensure that the first 5000 rows returned by the subselect of the insert statement are the same first 5000 rows seen by the delete statement.

A better way would probably be something like this (I am not 100% sure of the syntax):

DECLARE moredata INT;
DECLARE commit_count INT;
-- host variables for the fetched row; the column types are assumed here
DECLARE v_app_timestamp TIMESTAMP;
DECLARE v_work_id INTEGER;
DECLARE v_app_data CLOB(100K);
DECLARE v_created_timestamp TIMESTAMP;
-- WITH HOLD keeps the cursor open across the intermediate commits
DECLARE cursor1 CURSOR WITH HOLD FOR
    SELECT APP_TIMESTAMP, WORK_ID, APP_DATA, CREATED_TIMESTAMP
    FROM APP_ACTIVE WHERE DATE(CREATED_TIMESTAMP) < archiveDate
    FOR UPDATE;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET moredata = 0;

SET moredata = 1;
SET commit_count = 0;
OPEN cursor1;
FETCH cursor1 INTO v_app_timestamp, v_work_id, v_app_data, v_created_timestamp;
WHILE (moredata = 1) DO
    INSERT INTO APP_ARCHIVE (APP_TIMESTAMP, WORK_ID, APP_DATA, CREATED_TIMESTAMP)
        VALUES (v_app_timestamp, v_work_id, v_app_data, v_created_timestamp);
    -- delete exactly the row we just archived
    DELETE FROM APP_ACTIVE WHERE CURRENT OF cursor1;
    SET commit_count = commit_count + 1;
    IF (commit_count = 5000) THEN
        COMMIT;
        SET commit_count = 0;
    END IF;
    -- fetch the next row; the NOT FOUND handler ends the loop
    FETCH cursor1 INTO v_app_timestamp, v_work_id, v_app_data, v_created_timestamp;
END WHILE;
CLOSE cursor1;
COMMIT;


    HTH

    Andy

  4. #4
The idea I have:
How about doing the load from a cursor, to avoid the export step? Logging is then avoided for the load, and the deletion can take place in any order, no worries. A sketch is below.

From what I see, you have no issue with this taking too long; otherwise LOAD, or even EXPORT with a following LOAD back into the base table when a huge percentage of rows is deleted, is an option.
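A minimal sketch of the load-from-cursor variant, reusing the tables and columns from the first post (the cutoff date is a placeholder), run from a CLP session with a connection open:

    -- a declared cursor feeds the load directly; no intermediate file
    DECLARE c1 CURSOR FOR
    SELECT APP_TIMESTAMP, WORK_ID, APP_DATA, CREATED_TIMESTAMP
    FROM APP_ACTIVE
    WHERE DATE(CREATED_TIMESTAMP) < '2006-05-01';

    LOAD FROM c1 OF CURSOR
    MESSAGES load.msg
    INSERT INTO APP_ARCHIVE;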
    Juliane

  5. #5
With 300 - 500 thousand rows per day, he has got to have an issue with how long it takes if he is using an insert.
    --
    IBM Certified DBA on DB2 for Linux, UNIX, and Windows

    DB2 v9.7.0.6 os 6.1.0.0

  6. #6
    Quote Originally Posted by Cougar8000
With 300 - 500 thousand rows per day, he has got to have an issue with how long it takes if he is using an insert.
    :-) no further comment
    Juliane

  7. #7
Andy, thanks for your suggestion; it seems prudent, and we will try it. Thanks especially for the pseudo code, it always clarifies things.

About the export/load batch option: otherwise yes, that would be a good option, but there is a requirement I didn't mention in the first post. The start of the archiving (not a fixed date/time) must be determinable from the (Java) application, since there is some application-specific processing before and after the archive job (of course, the application could also start the export/load batch itself...). But as Cougar8000 warned, the export/load option might be worth considering for performance reasons anyway...

    Thanks for all your comments!

  8. #8
Do it like this:
-- the table we will move data from
    create table t_from (
    id int,
    name char(30)
    )@

-- the table we will move data to
    create table t_to (
    id int,
    name char(30)
    )@

    -- insert sample data
    insert into t_from values
    (1,'aaaaa'),
    (2,'bbbbb')
    @

    -- check what we have
    select * from t_from@
    select * from t_to@

    ----------------------------------------------------
    -- the next statement moves rows from t_from to t_to
-- it uses SELECT FROM OLD TABLE (DELETE ...)
-- and SELECT FROM NEW TABLE (INSERT ...)
-- and returns the number of moved rows
    ----------------------------------------------------
    with
    tf as (select * from old table (delete from t_from where id between 1 and 5000)),
    tt as (select id,name from new table (insert into t_to (id,name) select id,name from tf))
    select count(*) from tt
    @

-- check what happened
    select * from t_from@
    select * from t_to@

    -- change to commit if everything is ok!
    rollback@
