Hi - I have a problem with an update (with a join) that I'm attempting to run on a table with 53 million records in it.
Basically I set this query off for the first time and it chugged away for 96 hours before I killed it - something was awry.
I've restored the database to another server (just in case we have an issue with disks, compression, memory, etc.) and run some tests on samples of the 53 million rows. For small samples the update completes in a reasonable time, and the time increases linearly with sample size:
Top # sample from database tests:
1,000,000 - 36 seconds
2,000,000 - 71 seconds
3,000,000 - 122 seconds
but once I try 4,000,000 it runs and runs - got up to 36 minutes before I cancelled it.
Can anyone see any reason for this? I don't think the run time is going up exponentially - if you graph sample size against time from 1-3 million, the line is linear.
I'm currently adding a record ID to the table so I can select the bottom 4,000,000 and be sure there's not some weird data somewhere between 3 and 4 million that is making the query go whoopsie.
Here is the query I am attempting to run:

UPDATE MailingHistory_Sample
SET MailingHistory_Sample.ERIValue = ListCategoryHierarchy2.[ERI Value]
FROM MailingHistory_Sample
INNER JOIN ListCategoryHierarchy2
    ON MailingHistory_Sample.[SourceCode] = ListCategoryHierarchy2.[SourceCode ID]
Each table is clustered on SourceCode ID/Sourcecode
As I've mentioned, each table is indexed (clustered) on SourceCode ID/SourceCode.
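For what it's worth, the indexing described would look something like this (index names are illustrative, not taken from the actual database):

```sql
-- Hypothetical DDL matching the indexing described above:
-- each table clustered on its source-code column.
CREATE CLUSTERED INDEX IX_MailingHistory_SourceCode
    ON MailingHistory_Sample ([SourceCode]);

CREATE CLUSTERED INDEX IX_ListCategoryHierarchy2_SourceCodeID
    ON ListCategoryHierarchy2 ([SourceCode ID]);
```

With both tables clustered on the join key, the optimizer should be able to use a merge join for the update, so the join itself is unlikely to be the bottleneck here.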
We've actually managed to update all 54 million rows in this table by adding an identity column and running this update in batches of 3 million each - that took only a few hours.
It's definitely a volume issue, which I hit somewhere after 3 million, where the query just runs for days and (as far as I can see) never completes.
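The batched approach that did complete can be sketched roughly like this (it assumes the new identity column is called RecordID - the name is illustrative):

```sql
-- Batch the update in 3 million-row chunks by identity range,
-- so each transaction (and its log usage) stays bounded.
DECLARE @BatchStart BIGINT = 1;
DECLARE @BatchSize  BIGINT = 3000000;
DECLARE @MaxID      BIGINT;

SELECT @MaxID = MAX(RecordID) FROM MailingHistory_Sample;

WHILE @BatchStart <= @MaxID
BEGIN
    UPDATE mh
    SET mh.ERIValue = lch.[ERI Value]
    FROM MailingHistory_Sample AS mh
    INNER JOIN ListCategoryHierarchy2 AS lch
        ON mh.[SourceCode] = lch.[SourceCode ID]
    WHERE mh.RecordID BETWEEN @BatchStart AND @BatchStart + @BatchSize - 1;

    SET @BatchStart = @BatchStart + @BatchSize;
END;
```

Each batch commits separately, so the log never has to hold the before/after images for all 54 million rows at once.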
Any ideas what this might be?
If your database is running in Full Recovery mode, a complete before and after image of the update must be stored in the transaction log. If the log file is situated on the same physical drive as the primary data store, has too small an auto-grow value, or is badly physically fragmented, the overall update time can become very, very long.
Options include: issuing an ALTER DATABASE command (and taking a backup) to put the database into Simple Recovery before the update; relocating the transaction log to a dedicated physical drive; or pre-growing the transaction log to a very large size and not shrinking it back to the file system when backing up.
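The first and third options look something like this (the database name, log file name, and sizes are placeholders - adjust to your environment):

```sql
-- Switch to Simple Recovery before the big update,
-- then restore Full Recovery and back up afterwards.
ALTER DATABASE MyDatabase SET RECOVERY SIMPLE;

-- ... run the large update here ...

ALTER DATABASE MyDatabase SET RECOVERY FULL;
BACKUP DATABASE MyDatabase
    TO DISK = 'D:\Backups\MyDatabase.bak';  -- re-establishes the backup chain

-- Alternatively, pre-grow the log so it never has to auto-grow mid-update.
ALTER DATABASE MyDatabase
    MODIFY FILE (NAME = MyDatabase_log, SIZE = 50GB, FILEGROWTH = 1GB);
```

Note that while in Simple Recovery you lose point-in-time restore, so only do this in a maintenance window with a fresh backup taken first.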
Ah thanks - that makes sense. I only learnt this morning about putting data files and transaction logs on different physical disks. That is not currently the case for this database, so we are going to slap in a new drive and split the two.