Results 1 to 5 of 5
  1. #1
    Join Date
    Feb 2009
    Posts
    24

    Unanswered: SQL update split

    I want to run an update on a large table ~ 10,000,000 records, 200 columns and I am not allowed to disable logging :

    Code:
    UPDATE s.my_t SET col_updt =
    CASE
        WHEN col_ck = 'A' THEN 'AA'
        WHEN col_ck = 'B' THEN 'BB'
        ELSE ''
    END;
    COMMIT WORK;
    Since I cannot run this at once it was suggested to split it into several parts like 100. I was think to the following:

    Code:
    UPDATE (
    SELECT pkey, col_ck, col_updt, ROW_NUMBER() OVER (ORDER BY 1) as rownum from s.my_t
    )
    SET col_updt =
    CASE
        WHEN col_ck = 'A' AND rownum between 1 and 100000 THEN 'AA'
        WHEN col_ck = 'B' AND rownum between 1 and 100000 THEN 'BB'
        WHEN col_ck NOT IN ('A','B') AND rownum between 1 and 100000 THEN ''
    END;
    COMMIT WORK;
    
    UPDATE (
    SELECT pkey, col_ck, col_updt, ROW_NUMBER() OVER (ORDER BY 1) as rownum from s.my_t
    )
    SET col_updt =
    CASE
        WHEN col_ck = 'A' AND rownum between 100001 and 200000 THEN 'AA'
        WHEN col_ck = 'B' AND rownum between 100001 and 200000 THEN 'BB'
        WHEN col_ck NOT IN ('A','B') AND rownum between 100001 and 200000 THEN ''
    END;
    COMMIT WORK;
    ...
    and so on replicate for each of the parts. But when I tested this it did not work as wanted - after each commit it does the update but for all the other remaining rows at that moment actually set NULLs instead of doing nothing.
    Is this problem solvable ?

    I have tried to use WHERE ROWNUM BETWEEN ... :
    Code:
    $ db2set -g DB2_COMPATIBILITY_VECTOR=ORA
    DBI1301E  Invalid value.
    Explanation:
    The value specified for the registry variable is invalid.
    I have found similar posts on this forum and a solution when you have to query instead of update but I could not fix my issue ...

  2. #2
    Join Date
    Jun 2003
    Location
    Toronto, Canada
    Posts
    5,516
    Provided Answers: 1
    Quote Originally Posted by heartwork93 View Post
    did not work as wanted - after each commit it does the update but for all the other remaining rows at that moment actually set NULLs
    That's because you forgot your WHERE clause.

  3. #3
    Join Date
    Feb 2009
    Posts
    24
    Please can you be more specific. Do you mean to add WHERE clause as:

    Code:
    UPDATE (
    SELECT pkey, col_ck, col_updt, ROW_NUMBER() OVER (ORDER BY 1) as rownum1 from s.my_t WHERE rownum1 between 1 and 100000
    )
    SET col_updt =
    CASE
        WHEN col_ck = 'A' THEN 'AA'
        WHEN col_ck = 'B' THEN 'BB'
        WHEN col_ck NOT IN ('A','B') THEN ''
    END;
    COMMIT WORK;
    but this does not work:

    Code:
    DB21034E  The command was processed as an SQL statement because it was not a valid Command Line Processor command.  During SQL processing it returned:
    SQL0206N  "ROWNUM1" is not valid in the context where it is used.
    SQLSTATE=42703

  4. #4
    Join Date
    Jun 2003
    Location
    Toronto, Canada
    Posts
    5,516
    Provided Answers: 1
    Of course this would not work.

    SQL was designed to resemble plain English for a reason. For example, your original query in human language would sound something like this:

    "Select all rows from the table, then update every single row but only provide values for some; if the values are not provided, set columns to null. "

    And that's exactly what you got. Since you did not specify the WHERE clause neither in the subselect nor in the update statement itself, the entire table was the subject of the update, while you only provided new values for the first 100K rows.

    Your second attempt was more precise, although the syntax was wrong:

    "Select a subset of rows from the table, update all rows _in that subset_ providing values for each row."

    You'll need to find a valid criteria for the subset. Another option:

    "Select all rows from the table that have not been updated yet, update a subset of those rows, provide values for each row."

  5. #5
    Join Date
    Feb 2009
    Posts
    24
    Thanks n_i for the hints . This code sequence replicated did the job as desired:

    Code:
    UPDATE (
    SELECT pkey, col_ck, col_updt, ROW_NUMBER() OVER (ORDER BY 1) as rownum1 from s.my_t 
    )
    SET col_updt =
    CASE
        WHEN col_ck = 'A' THEN 'AA'
        WHEN col_ck = 'B' THEN 'BB'
        WHEN col_ck NOT IN ('A','B') THEN ''
    END
    WHERE rownum1 between 1 and 100000;
    COMMIT WORK;
    ...
    and so on

    Code:
    ...
    WHERE rownum1 between 100001 and 200000;
    ...
    WHERE rownum1 between 200001 and 300000;
    ...

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •