Page 1 of 2 12 LastLast
Results 1 to 15 of 20
  1. #1
    Join Date
    Nov 2004
    Location
    on the wrong server
    Posts
    8,835
    Provided Answers: 6

    Unanswered: fast way to move a clustered index on a half billion rows

    So I just got an email from Production Support saying an hour and a half downtime is unacceptable to move a half billion rows between 2 partitions because I am moving a clustered index and space is a consideration.

    I can not use partition switching because the clustered index is changing.

    This is what I am doing...

    1. I am creating a new table with the new cluster on a new partiton
    2. I am moving the records in 5K set based batches by doing a range search on the existing clustered index on the existing table.
    3. I then reapply all of the nonclustered index from the original table to the new one.
    4. I do a sp_rename swap out.

    The same way I have done this many times before. Is there some new secret special sauce (other than partition switching) I can use?
    Last edited by Thrasymachus; 02-29-12 at 10:23.
    “If one brings so much courage to this world the world has to kill them or break them, so of course it kills them. The world breaks every one and afterward many are strong at the broken places. But those that will not break it kills. It kills the very good and the very gentle and the very brave impartially. If you are none of these you can be sure it will kill you too but there will be no special hurry.” Earnest Hemingway, A Farewell To Arms.

  2. #2
    Join Date
    Jan 2003
    Location
    Massachusetts
    Posts
    5,800
    Provided Answers: 11
    Is there going to be any insert/update activity on the data?

  3. #3
    Join Date
    Nov 2004
    Location
    on the wrong server
    Posts
    8,835
    Provided Answers: 6
    My current plan involves down time. Application and sql agent turned off. This is hypocrisy and might be personal towards me. I know another team had a 2 hour deployment last month.
    Last edited by Thrasymachus; 02-29-12 at 11:15.
    “If one brings so much courage to this world the world has to kill them or break them, so of course it kills them. The world breaks every one and afterward many are strong at the broken places. But those that will not break it kills. It kills the very good and the very gentle and the very brave impartially. If you are none of these you can be sure it will kill you too but there will be no special hurry.” Earnest Hemingway, A Farewell To Arms.

  4. #4
    Join Date
    Jan 2003
    Location
    Massachusetts
    Posts
    5,800
    Provided Answers: 11
    Your approach sounds good enough to me. Unfortunately, my opinion may be of little value, due to the smallish nature of the databases I get to work with.

    I might consider setting auto_update_statistics for the database off during the transfer, but I doubt it will get you a large performance boost.

    How does the new clustered index look compared to the existing clustered index? If they are similar, I would be curious if it is better to create the index at the end, or let SQL Server try to rebalance the index tree every few million rows.

  5. #5
    Join Date
    Nov 2004
    Location
    on the wrong server
    Posts
    8,835
    Provided Answers: 6
    These are "history" tables fed by triggers to track changes. The tables are currently clustered the on the surrogate of the original tables ostensibly to optimize read operations. But this is one of the central tables and almost every IUD process ends up writing to them, and there are only about 6 processes that read from them. (as opposed to 100s of iud processes). One of our "processing" stored procs started spinning out of control because of the page splitting that was occuring. So I am moving the cluster on these tables to the surrogate of the history table and the surrogate of the live table will be unclustered.
    “If one brings so much courage to this world the world has to kill them or break them, so of course it kills them. The world breaks every one and afterward many are strong at the broken places. But those that will not break it kills. It kills the very good and the very gentle and the very brave impartially. If you are none of these you can be sure it will kill you too but there will be no special hurry.” Earnest Hemingway, A Farewell To Arms.

  6. #6
    Join Date
    Nov 2004
    Location
    on the wrong server
    Posts
    8,835
    Provided Answers: 6
    I have to align the cluster with the partition function though, right? that is the catch there. Also I think we are manually updating statistics in a job. This head Prod Support, was like "maybe the DBAs can fix it". The DBAs come running to me for help all the time. One day I woke up and I was the adult in the room. I am not sure when it happened.
    “If one brings so much courage to this world the world has to kill them or break them, so of course it kills them. The world breaks every one and afterward many are strong at the broken places. But those that will not break it kills. It kills the very good and the very gentle and the very brave impartially. If you are none of these you can be sure it will kill you too but there will be no special hurry.” Earnest Hemingway, A Farewell To Arms.

  7. #7
    Join Date
    Jan 2003
    Location
    Massachusetts
    Posts
    5,800
    Provided Answers: 11
    It may be a very good idea to have the clustered index match the partition function, but I do not see anything too specific about it in a 2 minute scan of BOL. It looks like you can have the partition function apply to almost any column, regardless of the clustered index. That would make sense, if you were partitioning on something with woefully low cardinality like department.

    If you change the clustered index to the surrogate of the history table, the reports may gain performance problems, in that they now have to (potentially) read all partitions to collect all of the history of one parent record. That may make for an impressive table scan. And the locking might come back to haunt you. How is the QA system?

    OK, I admit, I am kind of thinking out loud here, and none of this could be useful.

  8. #8
    Join Date
    Nov 2004
    Location
    on the wrong server
    Posts
    8,835
    Provided Answers: 6
    My best QA person is gone. I end up doing a lot during QA\verifcation myself to help the QA team because their database skills are not that great. I gauged the performance impact on the small number of reports and the other read operations. There is some drag, but acceptable. Most of the reporting uses wildly optimistic concurrency. This table gets written too many multitudes more times than read from. It will work out. I am also standing up a seperate replicated read only server, so this will be less of a concern.

    I think I might tell the Support guys to suck it up.
    “If one brings so much courage to this world the world has to kill them or break them, so of course it kills them. The world breaks every one and afterward many are strong at the broken places. But those that will not break it kills. It kills the very good and the very gentle and the very brave impartially. If you are none of these you can be sure it will kill you too but there will be no special hurry.” Earnest Hemingway, A Farewell To Arms.

  9. #9
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    I have a really perverse idea, but it ought to work. It will take two outages of a few seconds each, one at the beginning and one at the end of the row migration.

    1) Build new archive data with scratch name (I'll use AA here).
    2) Rename the existing table (BB) to a wholly new name (CC)
    3) Create a view over both CC and AA, and name that view BB
    4) Move (copy and delete) rows from CC to AA in a transaction
    5) Drop the BB view
    6) Rename table AA as BB

    Bear in mind that this is a Gedankenexperiment, not a tested procedure. Your milage may vary!

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  10. #10
    Join Date
    Nov 2004
    Location
    on the wrong server
    Posts
    8,835
    Provided Answers: 6
    I like it. The only problem I see that I will still have write operations aimed at the view. I know some views are updateable, but I am not sure, and I would guess not, if a view involves 2 tables in a union. Unless of course if I point the history trigger at CC and that might solve it.
    “If one brings so much courage to this world the world has to kill them or break them, so of course it kills them. The world breaks every one and afterward many are strong at the broken places. But those that will not break it kills. It kills the very good and the very gentle and the very brave impartially. If you are none of these you can be sure it will kill you too but there will be no special hurry.” Earnest Hemingway, A Farewell To Arms.

  11. #11
    Join Date
    Nov 2002
    Location
    Jersey
    Posts
    10,322
    can't you bcp the data out?
    Brett
    8-)

    It's a Great Day for America everybody!

    dbforums Yak CorralRadio 'Rita
    dbForums Member List
    I'm Good Once as I ever was

    The physical order of data in a database has no meaning.

  12. #12
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    Heck, point the trigger at AA if you do anything with it!

    In 90+ percent of the cases, if the two tables are joined without constraints (no criteria in the WHERE clause), SQL Server does the easy thing and plunks the data into the first table referenced in the view.

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  13. #13
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    Quote Originally Posted by Brett Kaiser View Post
    can't you bcp the data out?
    BCP is fast, but it can't participate in transactions and transactions are a key point in my idea.

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  14. #14
    Join Date
    Jan 2003
    Location
    Massachusetts
    Posts
    5,800
    Provided Answers: 11
    You may have to update the view on each transaction. If I recall correctly, the thing that makes the view updateable is that each unioned query had mutually exclusive where clauses. I presume these would have to move along with the transfers, in order for any update activity to land in the correct table.

  15. #15
    Join Date
    Nov 2002
    Location
    Jersey
    Posts
    10,322
    am I missing something?

    How long does bcp take?

    then bcp in to new

    Lock the table

    Insert transactions after bcp

    rename tables

    unlock table

    ???
    Brett
    8-)

    It's a Great Day for America everybody!

    dbforums Yak CorralRadio 'Rita
    dbForums Member List
    I'm Good Once as I ever was

    The physical order of data in a database has no meaning.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •