Results 1 to 10 of 10
  1. #1
    Join Date
    Jun 2005
    Posts
    319

    Unanswered: is this a useless index? (on a bit field, including pk-indentity)

    I'm taking a look at indexes on our boxes looking for ways to improve insert performance. So when I look at our front line sql servers I see it has two indexes:

    1) on the primary key (clustered) - this is fine
    2) on a bit field ASC that INCLUDEs the primary key column which is a bigint identity column

    Is the second index a waste or could some select queries benefit from it? If it is fine I will leave it alone, if there is no reason for it I will drop it. I am just hesitant to drop indexes without good cause.

  2. #2
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    I haven't confirmed this with a Clustered Index, but if you are using SQL Server Enterprise Edition, you can use the Online Indexing fand the Parallel Indexing unctions so that the time needed to build indexes doesn't significantly interfere with queries. You'll still incur CPU and I/O cost, and will almost double the disk usage while the index is built, but there should be minimal query impact.

    Due to lack of selectivity, I can't see ANY purpose for having an index that starts with a bit field. I see that as a flat out waste.

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  3. #3
    Join Date
    Jun 2005
    Posts
    319
    Thanks Pat only Standard, I'd hate to see the licensing cost for 126 servers o.O

  4. #4
    Join Date
    Jun 2005
    Posts
    319
    Eh, one thing is it does look like it is being used, so leave it alone?

    select i.name as indexname,
    object_name(s.[object_id]) as tablename, s.database_id, s.[object_id], s.User_seeks, last_user_seek, user_scans, last_user_scan,
    user_lookups, Last_user_lookup, user_updates, last_user_update
    from sys.dm_db_index_usage_stats s
    join sys.indexes i on s.object_id = i.object_id
    and s.index_id = i.index_id
    where OBJECTPROPERTY(s.OBJECT_ID,'IsUserTable') = 1
    and i.name = '{myindexname}'

    returns user_seeks from 30k to 5 million
    last_user_seek a few minutes ago
    user_scans 700-188k
    last_user_scan yesterday
    user_updates from 75 to 107k
    last_user_update from today

  5. #5
    Join Date
    Jul 2003
    Location
    San Antonio, TX
    Posts
    3,662
    I set up a sample replica of your scenario, and any query that references this field will result in 3-fold improvement in IO statistics if the index is present. Here's the sample code I used to build a scenario:
    Code:
    create table dbo.test (rowid int identity(1,1) not null primary key clustered, flag bit not null)
    go
    set nocount on
    insert dbo.test (flag)
    select case object_id % 2 when 0 then 1 else 0 end
    from msdb.sys.objects
    go 10000
    set statistics io on
    go
    select * from dbo.test where flag = 1
    set statistics io off
    go
    --Table 'test'. Scan count 1, logical reads 21118, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    create nonclustered index idx_test_flag on dbo.test (flag) include (rowid)
    go
    set statistics io on
    go
    select * from dbo.test where flag = 1
    set statistics io off
    go
    --Table 'test'. Scan count 1, logical reads 6002, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    "The data in a record depends on the Key to the record, the Whole Key, and
    nothing but the Key, so help me Codd."

  6. #6
    Join Date
    Jun 2005
    Posts
    319
    Good stuff.

  7. #7
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    A great demonstration of the ability to "prove" anything if you can create a test that proves it. You could also claim that water must cause cancer... Given 1000 healthy mice, divide them into two groups. Give one group of mice water, and one group no water at all. Only the mice given water will develop cancer, the mice that get no water will not develop cancer.

    Using a carefully contrived test that caters to the usefullness of an index that begins with a bit column, yoiu can show that the index is useful. I've never seen a query where only the bit column is referenced in all of the code that I've written or reviewed. That one case does make an index that starts with a bit column useful and it does significantly reduce IO as rdjabarov has demonstrated.

    Using rdjabarov's test data, any query that includes the rowid column in the WHERE clause will use the PK index. I contend that this is far more likely in the "real world" and certainly more likely in any app that I've worked on!

    As a side note, Gagnon reports that the index actually is used. In this case, it appears that there may be queries that only reference the bit column, so there probably is some benefit from having the index.

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  8. #8
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    What hasn't been mentioned is the distribution of the bit. If it is 99.5% one value and 0.5% the other then even for a large table it could be very valuable even for lookups. I'm thinking of (for example) an is_active bit.

    Robert - remember that argument we had years ago? Your code there demonstrates an index seek -> range scan. That was what I was trying to demonstrate the efficacy of.
    Testimonial:
    pootle flump
    ur codings are working excelent.

  9. #9
    Join Date
    Jul 2003
    Location
    San Antonio, TX
    Posts
    3,662
    Quote Originally Posted by Pat Phelan View Post
    ... Given 1000 healthy mice, divide them into two groups. Give one group of mice water, and one group no water at all. Only the mice given water will develop cancer, the mice that get no water will not develop cancer...
    Hmmm, that's because mice with no water will die from thirst long before they can develop anything

    On a serious note, an index like this I actually saw many times, and it was very useful in multi-field JOINs and WHERE clauses. It is also very successfully used in EXISTS clauses where the INCLUDEd field is also an index. But hey, I am with you on this, - stop torturing poor mice just because you want to prove something! Just let them all die from cancer!
    "The data in a record depends on the Key to the record, the Whole Key, and
    nothing but the Key, so help me Codd."

  10. #10
    Join Date
    Jun 2005
    Posts
    319
    So I found the exact sql calls to these tables referencing this bit, I have altered rdj's test accordingly...

    Code:
    create table dbo.test (rowid bigint identity(1,1) not null primary key clustered, flag bit not null)
    go
    set nocount on
    insert dbo.test (flag)
    select case object_id % 2 when 0 then 1 else 0 end
    from msdb.sys.objects
    go 10000
    
    select count(*) from test
    --12,080,000
    
    declare @lastRowId bigint
    
    set statistics io on
    
    select @lastRowId = IsNull(max(rowId),0) 
    from dbo.test where flag = 0
    
    -- other code executes...
    
    update test
    set flag = 1
    where rowId = @lastRowId
    
    set statistics io off
    go
    create nonclustered index idx_test_flag on dbo.test (flag) include (rowid)
    go
    declare @lastRowId bigint
    
    set statistics io on
    
    select @lastRowId = IsNull(max(rowId),0) 
    from dbo.test where flag = 0
    
    update test
    set flag = 1
    where rowId = @lastRowId
    
    set statistics io off
    go
    Output:
    Code:
    Beginning execution loop
    Batch execution completed 10000 times.
    Table 'test'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'test'. Scan count 0, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'test'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'test'. Scan count 0, logical reads 9, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Looks like it is not needed, thanks to all for the input.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •