  1. #1
    Join Date
    Nov 2002
    Posts
    272

    Unanswered: Changing datatype in big table: best way?

    Unfortunately, we ran out of integers in a PK column... (Bad design, I know)

    We have a table with an Identity column of type int, and since it contains {maxint} rows, no new rows can be added. So that's just over 2 billion records, and the total data size of the table is about 90 GB.
    So now we want to change the data type to bigint.

    We tried to do it with a simple Alter Table statement, but that ran into a logging problem: the 590 GB log-file disk ran out of space before the statement was done.
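
    Presumably the statement was something along these lines (table and column names here are made up for illustration). Widening an int column to bigint rewrites every one of the roughly 2.1 billion rows, and the whole rewrite is one transaction, so the log cannot be cleared until it finishes:

    Code:
    -- Any PK constraint or index on the column would have to be dropped first;
    -- the column rewrite then runs as one huge transaction.
    ALTER TABLE dbo.BigTable ALTER COLUMN id bigint NOT NULL;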

    What is the best way to change the data type from int to bigint in a table that contains 2 billion rows?

  2. #2
    Join Date
    Aug 2004
    Location
    Dallas, Texas
    Posts
    831
    Possibly create another table with bigint and copy the data from the old table; then drop the old table and rename the new one. I would also suggest leaving keys and indexes off the new table until after it is populated, for faster loading. Just my 2 kopecks. I am sure there are other ways around this problem that someone else can recommend.
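
    A minimal sketch of that approach, with made-up object names (dbo.BigTable, id, col1) standing in for the real ones:

    Code:
    CREATE TABLE dbo.BigTable_new
    (
        id   bigint IDENTITY(1,1) NOT NULL,   -- widened from int
        col1 varchar(50) NULL                 -- ... remaining columns as per the original table
    );

    SET IDENTITY_INSERT dbo.BigTable_new ON;

    -- On SQL Server 2008+, TABLOCK into an empty heap lets this be minimally logged under SIMPLE recovery
    INSERT INTO dbo.BigTable_new WITH (TABLOCK) (id, col1)
    SELECT id, col1
    FROM dbo.BigTable;

    SET IDENTITY_INSERT dbo.BigTable_new OFF;

    DROP TABLE dbo.BigTable;
    EXEC sp_rename 'dbo.BigTable_new', 'BigTable';

    -- Add the key only after the table is populated, as suggested above
    ALTER TABLE dbo.BigTable ADD CONSTRAINT PK_BigTable PRIMARY KEY CLUSTERED (id);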

  3. #3
    Join Date
    Nov 2002
    Posts
    272
    Actually, that's an idea we played with, too. We were also thinking of doing the copying in chunks, to prevent the transaction log from getting too big.

    This is what we'll try next, after we get the situation back to where it was before last night's failed attempt.

    We're still open to other suggestions, though!
    Last edited by ivon; 03-18-10 at 06:55.

  4. #4
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    Loading into a clustered index on the identity would actually be faster than loading into a heap.

    I would guess your problem is due to the page splits caused by the extra 4 bytes allocated per row. When you next defragment your indexes, you could try lowering the fill factor to leave more free space in the pages.

    Another alternative could be to use the negative range of the int.

    I assume you don't have any FKs on this either?

    I admit these are just ponderings - I can't recall having a problem like this before.
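
    For what it's worth, those two ideas might look roughly like this (object names are placeholders, not from the thread):

    Code:
    -- 1. Before any future in-place widening, rebuild with a lower fill factor so
    --    each page has free room for the extra 4 bytes per row:
    ALTER INDEX PK_BigTable ON dbo.BigTable REBUILD WITH (FILLFACTOR = 70);

    -- 2. Or buy time by reusing the negative half of the int range: reseed the
    --    identity at the minimum int value and let it climb back towards zero:
    DBCC CHECKIDENT ('dbo.BigTable', RESEED, -2147483648);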

  5. #5
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    Also, is this FULL or SIMPLE recovery?
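
    (It can be checked with something like the following; the database name is a placeholder.)

    Code:
    SELECT name, recovery_model_desc
    FROM sys.databases
    WHERE name = 'YourDatabase';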

  6. #6
    Join Date
    Nov 2002
    Posts
    272
    It's Simple recovery.

    I'm not sure what to make of your comment about the page splits. Do you mean that they are the cause of the huge transaction log?

    Using negative integers is a neat idea, but it would lead to the same problem in 5 more years, and then the table would be twice as big.
    Last edited by ivon; 03-18-10 at 07:05.

  7. #7
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    I can't think of anything else that could cause the log to grow like that apart from page splits. I've worked with some pretty big datasets and ended up with some sizable logs in the process, but 590 GB is beyond what I've ever produced.

    If the recovery model is simple then your batch idea will work - but you need to execute a CHECKPOINT command on every iteration to ensure SQL Server recycles the log.
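
    A rough sketch of that batched copy, assuming the new-table route and the same made-up names as above (dbo.BigTable with an int identity id, dbo.BigTable_new with a bigint id):

    Code:
    DECLARE @from bigint, @batch bigint, @maxid bigint;
    SET @from  = 0;
    SET @batch = 1000000;            -- rows per iteration; tune to taste
    SELECT @maxid = MAX(id) FROM dbo.BigTable;

    SET IDENTITY_INSERT dbo.BigTable_new ON;

    WHILE @from <= @maxid
    BEGIN
        INSERT INTO dbo.BigTable_new (id, col1)
        SELECT id, col1
        FROM dbo.BigTable
        WHERE id > @from AND id <= @from + @batch;

        SET @from = @from + @batch;

        CHECKPOINT;                  -- under SIMPLE recovery, lets the log space be reused
    END

    SET IDENTITY_INSERT dbo.BigTable_new OFF;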

  8. #8
    Join Date
    Jun 2004
    Location
    Long Island
    Posts
    696
    bcp out then bcp in to new table.
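
    Something along these lines, assuming character format (-c) so the int-to-bigint widening is handled on the way in, and a trusted connection (-T); the server, database, and file names are placeholders:

    Code:
    bcp MyDb.dbo.BigTable out E:\export\BigTable.dat -c -T -S MYSERVER
    bcp MyDb.dbo.BigTable_new in E:\export\BigTable.dat -c -T -S MYSERVER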

  9. #9
    Join Date
    Aug 2004
    Location
    Dallas, Texas
    Posts
    831
    I think they mentioned inserts, which cause a log record to be written for each row inserted. The bcp option noted above (and which I alluded to) can be used for a fast insert when no keys or indexes are on the table. Then add the PKs/FKs as required. No copy or inserts, please....

    Quote Originally Posted by pootle flump View Post
    I can't think of anything else that could cause the log to grow like that apart from page splits. I've worked with some pretty big datasets and ended up with some sizable logs in the process, but 590 GB is beyond what I've ever produced.

    If the recovery model is simple then your batch idea will work - but you need to execute a CHECKPOINT command on every iteration to ensure SQL Server recycles the log.

  10. #10
    Join Date
    Aug 2004
    Location
    Dallas, Texas
    Posts
    831
    Quote Originally Posted by PMASchmed View Post
    bcp out then bcp in to new table.
    My original suggestion.

  11. #11
    Join Date
    Jan 2003
    Location
    Massachusetts
    Posts
    5,799
    Provided Answers: 11
    The best thing about BCP is that you can specify a batch size, so you will not have to break up the import yourself.
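
    For example, the -b switch commits every n rows as a separate batch (same placeholder names as in the earlier sketch):

    Code:
    bcp MyDb.dbo.BigTable_new in E:\export\BigTable.dat -c -T -S MYSERVER -b 100000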

  12. #12
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    Quote Originally Posted by corncrowe View Post
    I think they mentioned inserts, which cause a log record to be written for each row inserted. The bcp option noted above (and which I alluded to) can be used for a fast insert when no keys or indexes are on the table. Then add the PKs/FKs as required. No copy or inserts, please....
    Nope - monotonically increasing clustered indexes are faster for inserts than heaps. I can provide the references if required.

    BCP has an "optimise for order" option too.
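
    That would presumably be the ORDER hint via the -h switch: if the export file is already sorted by the clustering key, something like this (placeholder names again) tells bcp so and takes a table lock for the load:

    Code:
    bcp MyDb.dbo.BigTable_new in E:\export\BigTable.dat -c -T -S MYSERVER -b 100000 -h "TABLOCK, ORDER(id ASC)"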

  13. #13
    Join Date
    Aug 2004
    Location
    Dallas, Texas
    Posts
    831
    snip -- ok -
    Last edited by corncrowe; 03-18-10 at 17:40.
