I already posted this over on sqlteam so don't peek there if you haven't seen that post yet.
So now to the question:
Anyone care to guess how long it took me to build a clustered index on a table with 900 million rows? This is the largest amount of data in a single table I have had to work with thus far in my career! It's sorta fun to work with such large datasets.
1. running sql 2005 on a dual proc 32bit server, 8gb ram, hyperthreaded, 3ghz clock. disk is a decent SAN, not sure of the specs though.
2. ddl for table:
CREATE TABLE [dbo].[fld](
[id] [bigint] NOT NULL,
[id2] [tinyint] NOT NULL,
[extid] [bigint] NOT NULL,
[dd] [bit] NOT NULL,
[mp] [tinyint] NOT NULL,
[ss] [tinyint] NOT NULL,
[cc] [datetime] NOT NULL,
[ff] [tinyint] NOT NULL,
[mm] [smallint] NOT NULL,
[ds] [smallint] NOT NULL
3. ddl for index (this is the only index on the table):
CREATE CLUSTERED INDEX [CIfld]
)WITH (FILLFACTOR=100, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF)
4. extid column was not sorted to begin with. ordering was completely random.
Note that I have changed the column names, etc, to protect the innocent. I can't go into details about what it's for or I'd be violating NDA type stuff.
because this table is not transactional. no inserts after the index is built, only selects
Originally Posted by MCrowley
Is extid supposed to be an identity field?
No. It's not a unique column. However is highly selective. requirement was to return ranges of extid also ordered by extid, I found reads were fastest if I built a clustered index on this column just for this purpose.
Originally Posted by Brett
What happens if you load it with now index, unloaded with QUERYOUT sorted by the id, then reload it, and then build the index?
That could be something to try if I ever need to do this again. I didn't look at other methods because 1.25 hrs was more than acceptable for building this index.