Results 1 to 2 of 2
  1. #1
    Join Date
    Jan 2012
    Posts
    29

    Unanswered: update on large table

    Hi,
    I need to update a large table, about 55 million rows, without filling the transaction log, in the shortest time as possible.
    The goal is to alter the table and change the data type for Text column from VARCHAR(7900) to NVARCHAR(MAX).
    Since I cannot do it with an ALTER TABLE statement (it would fill up the transaction log) I'm thinking to:
    - rename column Text in Text_OLD
    - add Text column of type NVARCHAR(MAX)
    - copy values in batches from Text_OLD to Text

    The table is defined like:
    create table DATATEXT(
    rID INTEGER NOT NULL,
    sID INTEGER NOT NULL,
    pID INTEGER NOT NULL,
    cID INTEGER NOT NULL,
    err TINYINT NOT NULL,
    multi TINYINT NOT NULL,
    count INTEGER NOT NULL,
    Text VARCHAR(7900),
    repos BIGINT NOT NULL,
    Sequence ROWVERSION NOT NULL
    CONSTRAINT pk_data PRIMARY KEY( rID, sID, pID, cID )
    );
    exec sp_indexoption 'DATATEXT.pk_data', 'AllowPageLocks', FALSE;
    CREATE INDEX IX_CID ON DATATEXT( cID );
    exec sp_indexoption 'DATATEXT.IX_CID', 'AllowPageLocks', FALSE;
    CREATE INDEX IX_Sequence ON DATATEXT( Sequence ) WITH FILLFACTOR = 100;
    exec sp_indexoption 'DATATEXT.IX_Sequence', 'AllowPageLocks', FALSE;


    I've thought about a stored procedure doing this but I'd like an advice on how to copy values in batch from Text_OLD to Text.
    The code I would start with (doing just this part) is the following, but maybe there are more efficient ways to do it, or at least there's a better way to select @startSeq in the WHILE loop (avoiding to select a bunch of 100000 sequences and later selecting the max).

    declare @startSeq timestamp
    declare @lastSeq timestamp
    select @lastSeq = MAX(sequence) from [DATATEXT] where [Text] is null
    select @startSeq = MIN(Sequence) FROM [DATATEXT] where [Text]is null
    BEGIN TRANSACTION T1
    WHILE @startSeq < @lastSeq
    BEGIN
    with sequenceBunch as (
    select top 100000 Sequence from [DATATEXT] where [Text] is null order by Sequence)
    select @startSeq = MAX(Sequence) from sequenceBunch
    UPDATE [DATATEXT] SET [Text]=[Text_OLD] where Sequence < @startSeq
    COMMIT TRANSACTION T1
    END

    Thanks

  2. #2
    Join Date
    Jan 2007
    Location
    UK
    Posts
    11,434
    Provided Answers: 10
    If you can, set the recovery mode to simple as this will log less information than other options.
    Also, if you can, periodically truncate the log (when in simple mode) so as not to fill it.

    It doesn't make a vast difference but I think I would add the new column with a different name, rather than renaming the existing column first. This is only because a rollback might be easier (don't rock the boat until you really have to).

    You are right to do the operation in batches. However I might change your loop logic slightly (again, this is mostly personal preference)
    Code:
    SET NOCOUNT ON;
    
    DECLARE @rc int = 937;
    
    WHILE @rc > 0
      BEGIN
        BEGIN TRANSACTION;
    
          UPDATE TOP (10000) DATATEXT
          SET    Text_New = "Text"
          WHERE  Text_New IS NULL
          AND    "Text" IS NOT NULL
          ;
    
          SET @rc = @@RowCount;
    
        COMMIT TRANSACTION;
      END
    ;
    George
    Home | Blog

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •