  1. #1
    Join Date
    Jul 2010
    Posts
    4

    Unanswered: Delete after Multi-Fetch

    I am trying to delete rowsets after fetching them with a cursor declared WITH ROWSET POSITIONING.

    The cursor looks like this:
    DECLARE CUR_A CURSOR
    WITH ROWSET POSITIONING
    WITH HOLD FOR
    SELECT R_NR FROM TABLE_A
    WHERE R_NR IN
    (SELECT R_NR FROM SESSION.SDYTMP)

    The fetch and the positioned delete look like this:
    EXEC SQL
    FETCH NEXT ROWSET
    FROM CUR_A FOR 100 ROWS
    INTO :WS-R-NR
    END-EXEC

    EXEC SQL
    DELETE FROM TABLE_A
    WHERE CURRENT OF CUR_A
    END-EXEC
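
    As an aside, a positioned DELETE against a rowset cursor removes every row of the current rowset by default; DB2 for z/OS also accepts a FOR ROW n OF ROWSET clause when only one row of the rowset should go. A small sketch of both forms, using the names from this post:

    Code:
    -- Removes all 100 rows of the current rowset (the form used above):
    DELETE FROM TABLE_A
     WHERE CURRENT OF CUR_A;

    -- Removes a single row of the rowset; the row number 5 is purely
    -- illustrative and could also be a host variable:
    DELETE FROM TABLE_A
     WHERE CURRENT OF CUR_A
       FOR ROW 5 OF ROWSET;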

    Now, you can see in the DECLARE CURSOR that I used a temporary table (SESSION.SDYTMP). The reason is that I had to perform this same delete on 10 other tables satisfying the same condition (a set of IDs). To retrieve the IDs, I had to join 3 tables. So, instead of repeating the 3-table join in the WHERE clause of the DECLARE CURSOR for each of the 10 tables, I preferred to do the join just once and store all the IDs in the temporary table SESSION.SDYTMP.

    Now, I tried to delete 80 records (across all 10 tables) using this program. It took 20 minutes, which is unacceptable. In the DETECTOR tool, I could see that the FETCH statements for the 10 tables are the ones taking the most time, and the GETPAGE count is very high! I am wondering why it takes so much time. Is it something to do with accessing the temporary table, should I add indexes, or does it have anything to do with the rowset size being 100? I commit every 300 records. Please help me improve the performance of this program.
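
    A hedged aside on the numbers above: if TABLE_A and the other tables have no index on the column matched against SESSION.SDYTMP, each cursor can end up scanning the base table, which would account for slow FETCHes and a huge GETPAGE count. One thing to try, with a hypothetical index name:

    Code:
    -- Hypothetical index on the purge key of the base table; DB2 also
    -- permits indexes on declared temporary tables if the subquery
    -- side turns out to be the problem.
    CREATE INDEX XTABA_RNR
        ON TABLE_A (R_NR);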

  2. #2
    Join Date
    Dec 2007
    Location
    Richmond, VA
    Posts
    1,328
    Provided Answers: 5
    Stop using the temp table. All that does is add time to whatever processing you are attempting. Temp tables were introduced to let you put non-DB2 data into the database so that you can run SQL against it and, possibly, join it with other tables. Almost every other use I have seen has come down to a bad process or someone being lazy.
    Are you sure about what is taking all of the time? You spoke of deleting 80 records taking 20 minutes, but you follow that up with DETECTOR showing the time went to the FETCHes against all 10 tables, which you did not show us here.
    How about starting with what you are after in this process, so we can see if we can help you with the real issue?
    Dave Nance

  3. #3
    Join Date
    Jul 2010
    Posts
    4

    Thanks Dave

    Thanks for your time, Dave. Let me explain the process. We are trying to purge/delete old data (older than 2 years) from a group of about 11 tables.

    This is the hierarchy of the tables.
    Parent: BATCH table
    Child of BATCH table: TRANSACTION table
    Child of TRANSACTION table: 10 child tables exist.

    The parent-child relationship is not enforced in the database (no foreign keys in the child tables). The relationship exists as a business rule: there is a common column in all the child tables through which we can relate the data.

    Common column between BATCH and TRANSACTION table: BATCH_ID
    Common column between TRANSACTION and the 10 child tables: TRAN_ID

    I have a 'Purge indicator' column in the BATCH table. I update the 'Purge indicator' to 'Y' in the BATCH table using a COBOL program (Program A), based on some conditions (how old the data is, whether it was processed successfully, etc.).

    I have another COBOL program (Program B) that does the following.

    1) Select all the rows in the BATCH table where the 'Purge indicator' = 'Y'. Store all the BATCH_IDs from the result in a temporary table (SESSION.BATCHTMP).

    2) Select all the rows from the TRANSACTION table where BATCH_ID (on the TRANSACTION table) = a BATCH_ID stored in the temporary table. (This gets me all the child rows in the TRANSACTION table for the parent BATCH table.) Store all the TRAN_IDs from the result in another temporary table (SESSION.TRANTMP).

    3) Now, DECLARE a CURSOR with ROWSET POSITIONING for one of the 10 child tables of the TRANSACTION table. Let's call this table the ORDER table.

    DECLARE CUR_ORDER CURSOR
    WITH ROWSET POSITIONING
    WITH HOLD FOR
    SELECT TRAN_ID FROM ORDER
    WHERE TRAN_ID IN
    (SELECT TRAN_ID FROM SESSION.TRANTMP)
    FOR UPDATE

    4) FETCH 100 rows at a time in a loop and DELETE them:
    FETCH NEXT ROWSET
    FROM CUR_ORDER FOR 100 ROWS
    INTO :WS-TRAN-ID

    DELETE FROM ORDER
    WHERE CURRENT OF CUR_ORDER

    5) Do steps 3 and 4 above for each of the other 9 tables.

    6) Once all the rows are deleted from the 10 child tables, DELETE the rows from the TRANSACTION table.

    7) Update the 'Purge indicator' in the BATCH table from 'Y' to 'P'.

    Note 1: COMMIT is done every 300 records.
    Note 2: The process took more than 20 minutes to DELETE a total of 80 rows (combined across all the tables). I delete all the selected rows from every table mentioned in this post except the BATCH table.

    I agree that I have to stop using the temporary tables. But if I run the join that selects the rows for each of the bottom 10 child tables, that is going to cause a performance issue too. Can you please suggest how I can improve the performance?
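
    For reference, a sketch of how the temporary tables in steps 1 and 2 might be declared and loaded. The column types and the PURGE_IND column name are assumptions; ON COMMIT PRESERVE ROWS matters here because the program commits every 300 records:

    Code:
    -- Assumed DGTT definitions; the key types are guesses.
    DECLARE GLOBAL TEMPORARY TABLE SESSION.BATCHTMP
        (BATCH_ID INTEGER NOT NULL)
        ON COMMIT PRESERVE ROWS;

    DECLARE GLOBAL TEMPORARY TABLE SESSION.TRANTMP
        (TRAN_ID INTEGER NOT NULL)
        ON COMMIT PRESERVE ROWS;

    -- Step 1: batches flagged for purging.
    INSERT INTO SESSION.BATCHTMP (BATCH_ID)
        SELECT BATCH_ID
          FROM BATCH
         WHERE PURGE_IND = 'Y';

    -- Step 2: the transactions belonging to those batches.
    INSERT INTO SESSION.TRANTMP (TRAN_ID)
        SELECT T.TRAN_ID
          FROM TRANSACTION T
         INNER JOIN SESSION.BATCHTMP B
            ON B.BATCH_ID = T.BATCH_ID;
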
    Last edited by db2gcs; 07-20-10 at 15:25.

  4. #4
    Join Date
    Dec 2007
    Location
    Richmond, VA
    Posts
    1,328
    Provided Answers: 5
    A couple of ideas here.

    1. Do not fetch the records you want to delete. Use a DELETE statement that joins each of your 10 tables with the upper 2 tables to check the purge flag, delete 300 records, then commit (see the sketch after this list).

    2. An option that I have employed in the past, and one that should work well for you since you have your purge flag and you update it on just one BATCH_ID at a time: put the purge flag into all of the tables you wish to purge. Put an update trigger on your BATCH table so that when the purge flag = 'Y' it updates your TRANSACTION table, and an update trigger on your TRANSACTION table so that when the purge flag = 'Y' it updates your 10 children. Now, instead of having a program do all of these deletes, run a REORG with DISCARD from the bottom tables up. Caveat: if you are not on V8 or higher, which you should be, the tables would have to be in read-only mode while the reorgs run.
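
    A minimal sketch of option 1, assuming DB2 9 for z/OS or later (which allows ORDER BY and FETCH FIRST in a subselect) and the table and column names used earlier in the thread; PURGE_IND is an assumed name for the purge flag column:

    Code:
    -- Purge the child rows for at most 300 flagged transactions per
    -- unit of work. The ORDER BY keeps the same slice of TRAN_IDs
    -- stable across the 10 child-table deletes in one cycle.
    DELETE FROM ORDER
     WHERE TRAN_ID IN
           (SELECT T.TRAN_ID
              FROM TRANSACTION T
             INNER JOIN BATCH B
                ON B.BATCH_ID = T.BATCH_ID
             WHERE B.PURGE_IND = 'Y'
             ORDER BY T.TRAN_ID
             FETCH FIRST 300 ROWS ONLY);
    After the 10 child deletes, delete the same slice from the TRANSACTION table, COMMIT, and repeat; once a batch has no transactions left, its indicator can be flipped to 'P'.
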
    Dave Nance

  5. #5
    Join Date
    Jul 2010
    Posts
    4

    Clarifications, please

    Dave, thanks again. While I look at the other options in the meantime, I have a few questions on option 1 that you advised. Can you please clarify them? Thanks again.

    What I understand from your option 1 is:

    1) Do not use fetch:
    Does this mean you are asking me to remove the CURSOR logic entirely? No more CURSOR?

    2) Use a DELETE statement with a join on the upper 2 tables:
    I got this. No questions.

    3) Delete 300 records, then commit:
    As I understand your option 1, I am not going to use a CURSOR; instead I will use a DELETE statement. But a plain DELETE will remove all the qualifying rows in my child table (e.g., the ORDER table) at once. As you know, I cannot delete thousands of rows in one DELETE, as we will hit MAX LOCKS issues, etc. Also, how can I commit every 300 records if I delete them all at once? I think I am missing something here.

    Can you please clarify? Thanks

    Note: We are going to run this purge program at least once every 2 weeks, and the expected volume to delete in each run is approximately 1 million records (combining the rows from all tables).
    Last edited by db2gcs; 07-20-10 at 16:11.

  6. #6
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    It would be easy to delete the rows in the 10 child tables related to each slice of nn rows (here 10) with purge_indicator = 'Y' in the batch table, and to update purge_indicator in the batch table, in a single statement. The final SELECT references every CTE, so each delete and the update execute as part of that one statement, and their row counts come back in the result.

    Here is an example:
    Code:
    WITH batch_purge(batch_id, tran_id) AS (
    SELECT b.batch_id
         , t.tran_id
     FROM  batch       b
     INNER JOIN
           transaction t
       ON  t.batch_id = b.batch_id
     WHERE b.purge_indicator = 'Y'
    FETCH FIRST 10 ROWS ONLY
    )
    , delete_c01(delete_count_c01) AS (
    SELECT COUNT(*)
      FROM OLD TABLE
           (DELETE FROM order
             WHERE tran_id
                   IN (SELECT tran_id
                         FROM batch_purge
                      )
           ) d
    )
    , delete_c02(delete_count_c02) AS (
    SELECT COUNT(*)
      FROM OLD TABLE
           (DELETE FROM child_02
             WHERE tran_id
                   IN (SELECT tran_id
                         FROM batch_purge
                      )
           ) d
    )
    .....
    , delete_c10(delete_count_c10) AS (
    SELECT COUNT(*)
      FROM OLD TABLE
           (DELETE FROM child_10
             WHERE tran_id
                   IN (SELECT tran_id
                         FROM batch_purge
                      )
           ) d
    )
    , update_indicator(purge_count) AS (
    SELECT COUNT(*)
      FROM FINAL TABLE
           (UPDATE batch
               SET purge_indicator = 'P'
             WHERE batch_id
                   IN (SELECT batch_id
                         FROM batch_purge
                      )
           ) u
    )
    SELECT delete_count_c01
         , delete_count_c02
    .....
         , delete_count_c10
         ,   delete_count_c01
           + delete_count_c02
         .....
           + delete_count_c10
           AS delete_count_all
         , purge_count
      FROM delete_c01
         , delete_c02
    .....
         , delete_c10
         , update_indicator
    ;
    Last edited by tonkuma; 07-20-10 at 18:11. Reason: Add UPDATE batch and "and update purge_indicator in batch table".

  7. #7
    Join Date
    Dec 2007
    Location
    Richmond, VA
    Posts
    1,328
    Provided Answers: 5
    Aside from Tonkuma's suggestion, there are many ways to write the SQL to limit your deletes to 300. I bet Tonkuma and Lenny could throw at least 10 of them at you today; I just don't have the time to do more than give you some high-level hints at the structure.
    I wanted to suggest you take a look at the other option I offered as well, since it will always be the cheapest, fastest method for performing your purge. There would be no programs for your staff to maintain, as IBM does all the maintenance work on utilities and constantly improves their performance (have you seen some of the documentation coming out about V10?). There would be no logging. You are going to have to follow up your purge process with periodic REORGs anyway.
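
    For concreteness, a rough sketch of the REORG with DISCARD this points at; the database, tablespace, and creator names are placeholders, PURGE_IND is an assumed column name, and it presumes the triggers from option 2 have propagated the flag down to the child tables:

    Code:
    -- Run bottom-up: each child table space first, then TRANSACTION,
    -- then BATCH. Discarded rows land in the discard data set, which
    -- can double as a crude archive.
    REORG TABLESPACE DBPURGE.TSORDER
      DISCARD FROM TABLE CREATOR.ORDER
        WHEN (PURGE_IND = 'Y')
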
    One other item I failed to mention earlier: there are some great archival tools out there that can be purchased and set up to either archive or simply purge the data, and they will issue commits at the points that you wish.
    Dave Nance

  8. #8
    Join Date
    Jul 2010
    Posts
    4

    Thanks tonkuma and Dave!

    Thanks, guys! I have a small update on my process. I changed the following 2 numbers and ran the process again. It took 4 minutes to delete 500,000 records, as opposed to the 20 minutes it took to delete 80 records before.

    Changes done:
    1) Commit frequency changed from 5000 to 300
    2) Multi-fetch changed from fetching 1000 records to 100 records.

    I don't know exactly why, but it did the magic. I will still look into the options you both suggested. Thanks again for your time.
