I am looking to run UPDATE STATISTICS for the first time (don't ask why it wasn't done before, please) on a set of large tables in our 346 GB database, which has been populated with transactional data for the past 4 years. The tables contain 1.2, 35, 64, and 92 million rows. I have used the STAT_DATE function to determine that none of these tables have ever had UPDATE STATISTICS run against them.
My question is: how should I go about this process, and what options should I select when issuing the command? I assume I must first run with the FULLSCAN parameter to generate the initial statistics for each table, and that after this initial population I could run the command nightly without any parameters to keep statistics up to date. Any guidance you all could provide to a newb would be greatly appreciated.
Fullscan is actually not necessary; estimating (sampled) statistics should be enough, unless you have unlimited tempdb space. You can divide the work into chunks by doing each index separately. This should help concurrency, as I believe UPDATE STATISTICS takes out shared locks on the table. If the single indexes go quickly enough, you can hazard a whole table at once.
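To make the per-index approach concrete, here is a sketch in SQL Server 2000 syntax (the table and index names are hypothetical):

```sql
-- One statement per index keeps each shared-lock window short;
-- statistics are sampled by default when FULLSCAN is not specified.
UPDATE STATISTICS dbo.Orders IX_Orders_CustomerID
UPDATE STATISTICS dbo.Orders IX_Orders_OrderDate

-- If the per-index runs finish quickly, try the whole table at once:
UPDATE STATISTICS dbo.Orders
```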
Ok, that is good, as FULLSCAN on the smallest table has now run for 4 hours with no sign of completing. Could you please confirm which options I should specify when executing UPDATE STATISTICS? Is it simply UPDATE STATISTICS [Table Name] if I want to update statistics for all indexes on the table, or are there other options needed to best execute this command? Thanks for the help!
There are not a lot of options for UPDATE STATISTICS on SQL 2000. In your case, I would probably just use SAMPLE 30 PERCENT, and run it for each named index. Like I say, if that runs quickly enough, you can up the percentage on subsequent indexes.
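A sketch of that suggestion, again with hypothetical table and index names:

```sql
-- SAMPLE 30 PERCENT reads roughly a third of the pages rather than the
-- whole table; raise the percentage later if the runtime is acceptable.
UPDATE STATISTICS dbo.OrderDetail IX_OrderDetail_ProductID WITH SAMPLE 30 PERCENT
```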
pootle flump: given that our most active tables contain 25-65 million rows, the amount of data change required to trigger an automatic statistics update would negate any benefit of enabling that feature, as would the performance hit we would take when the data-change volume did trigger an update. Because the update would fire automatically on these tables, potentially during peak work windows, we determined it was not beneficial. As a solution, we run DBCC DBREINDEX nightly on critical table indexes, which updates the statistics as part of its execution.
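For reference, a nightly rebuild along these lines might look like the following (names are hypothetical). DBCC DBREINDEX rebuilds the index, and the rebuild refreshes that index's statistics as a side effect:

```sql
-- Third argument is the fill factor; 0 keeps the existing setting.
DBCC DBREINDEX ('dbo.Orders', 'IX_Orders_CustomerID', 90)

-- An empty index name rebuilds all indexes on the table:
DBCC DBREINDEX ('dbo.Orders', '', 0)
```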
MCrowley: I ran UPDATE STATISTICS [table name] WITH SAMPLE 10 PERCENT on one of our larger, most active tables (35.6 million rows), and it ran for 7 hours and 50 minutes. On further analysis I found that this table is re-indexed nightly, which appears to update the statistics, based on the output of DBCC SHOW_STATISTICS after the reindexing.
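For anyone following along, the statistics can be inspected like this (table and index names are hypothetical); the "Updated" column in the first result set shows when the statistics were last refreshed:

```sql
DBCC SHOW_STATISTICS ('dbo.BigTable', 'IX_BigTable_TranDate')
```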
The STATS_DATE function would probably be a friendlier way of determining the last statistics date for these tables. If updating the statistics takes almost 8 hours, I hate to think how long the nightly rebuild of the indexes takes.
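On SQL 2000, STATS_DATE can be joined against sysindexes to list the last update per index; the table name here is hypothetical:

```sql
-- STATS_DATE(table id, index id) returns when that index's statistics
-- were last updated, or NULL if they have never been built.
SELECT name, STATS_DATE(id, indid) AS last_stats_update
FROM sysindexes
WHERE id = OBJECT_ID('dbo.Orders')
  AND indid BETWEEN 1 AND 250   -- skip the heap row (0) and text/image row (255)
```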
The nightly rebuild of indexes takes only 1 hour in total and processes all tables in the database whose scan density is greater than 10%. The fact that our nightly rebuild takes only an hour is why I was concerned about the execution time of UPDATE STATISTICS. After looking at the system I was using to test, I found that its hardware configuration was the cause of the slow UPDATE STATISTICS execution.