Head column uniqueness on composite index?
Hello all. I read an interesting article on an IBM website about evaluating the uniqueness of indexed columns. One part that was not clear to me was how a composite index's columns should be ordered on creation and how that affects the uniqueness measurement. For example, I have a table called tblRecords that has 93416076 rows.
Amongst other indexes, I have one index in particular that's got me questioning its formation, based on IBM's instructions on how to measure uniqueness. The index looks like this:
create index "informix".cl_tapekey_v2 on "informix".tblRecords
(subclient_key,claim_no,claim_line_key) using btree ;
Now, in order to measure an index's uniqueness, I use the following SQL:
select tabname, idxname, nrows, nunique
from systables t, sysindexes i
where t.tabid = i.tabid
and t.tabid >99
and nrows >0
and nunique >0
and tabname = 'tblRecords'
and idxname = 'tblRecords_pk_v2'
And my results look like this:
And again, per the IBM article, to get the percentage of uniqueness I use <percentage of uniqueness> = ((nunique/nrows)*100), which yields a pretty low percentage, something like 9.7%!
I think this is because the number of unique values in my "head column" (the first column in the index creation statement) of the composite index is a lowly 90. But I am not sure I am interpreting this correctly, or whether the "head column" is actually the most important part of the composite index. If the head column is the heavy hitter, should this always be the column with the most unique data?
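For what it's worth, the arithmetic from the article works out like this. A minimal sketch; the nunique value below is hypothetical (the actual result row isn't shown above), chosen so the percentage lands near the 9.7% mentioned:

```python
# nrows is taken from the question; nunique is a hypothetical value
# picked to illustrate a ~9.7% uniqueness result.
nrows = 93416076      # rows in tblRecords (from the question)
nunique = 9060000     # hypothetical unique key count

# <percentage of uniqueness> = ((nunique/nrows)*100), per the IBM article
pct_uniqueness = (nunique / nrows) * 100
print(f"{pct_uniqueness:.1f}%")
```

Note that if you compute this directly in SQL, make sure the division is not done in integer arithmetic, or nunique/nrows will round down to zero before the multiply.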
What they are trying to tell you in this article is that you should avoid highly duplicate indexes, and with this query you can find those indexes.
First of all, why avoid highly duplicate indexes? There are two reasons:
1. If an index scan still finds, let's say, 30% of the rows, then the
   filtering is pretty poor and a lot of rows still have to be scanned.
2. IDS is optimized to work with small index entries. Deleting a complete entry
   (= one key value) can be done by the b-tree cleaner/scanner in the
   background. But deleting a row from a list of rows that share a key must be
   done in the foreground: all rowids after the deleted row must be shifted
   one position. This is really bad.
Especially to prevent the second problem, IBM recommends avoiding highly duplicate indexes. You can do this by adding an extra column to the index (at the end) which makes the key more unique, for example the primary key. This looks ridiculous at first sight, but it achieves two goals:
1. (More) unique keys, so easier/faster deleting of rows (and in the background)
2. Same search results and performance
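To make this concrete, here is a sketch against SQLite rather than IDS, so treat it as an illustration of the idea only, not of Informix internals. The table and leading column names are taken from the question; record_key (a stand-in primary key) and the index name cl_tapekey_v3 are hypothetical. Appending the primary key makes every index entry unique, yet an equality search on the leading column still uses the index:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE tblRecords (
    record_key INTEGER PRIMARY KEY,   -- hypothetical primary key
    subclient_key INTEGER,
    claim_no INTEGER,
    claim_line_key INTEGER)""")

# Variant of the composite index with the primary key appended:
# every entry is now unique, even though subclient_key has few
# distinct values.
con.execute("""CREATE INDEX cl_tapekey_v3 ON tblRecords
    (subclient_key, claim_no, claim_line_key, record_key)""")

# An equality search on the leading column can still use the index.
plan = con.execute("""EXPLAIN QUERY PLAN
    SELECT * FROM tblRecords WHERE subclient_key = 42""").fetchall()
for row in plan:
    print(row[3])   # plan detail mentions cl_tapekey_v3
```

The same shape applies in Informix: the extra trailing column does not change which queries can use the index, it only disambiguates entries that would otherwise share a key.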
Remember that nunique is not for the head column but for the whole index.
I would disagree with RobP's statement:
"Remember the nunique is not for the head column but for the whole index."
I would like to point out that nunique values are separate for each index. That means if a table has 3 indexes, it has 3 different nunique values. But the nunique value refers to, and holds valid for, the first column of the index *only*, not all the columns that make up a composite index.
The UPDATE STATISTICS statement refreshes the nunique column value, and the optimizer uses this value internally to get the best out of a table and run a query as efficiently as possible.
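The two readings in this thread (nunique counts distinct head-column values versus distinct whole-key tuples) give very different numbers, which is easy to check. A toy sketch using SQLite to do the counting, not Informix catalog behaviour; the 90 distinct head values mirror the figure from the question:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE t
    (subclient_key INTEGER, claim_no INTEGER, claim_line_key INTEGER)""")

# 90 distinct head-column values (as in the question), but every
# (subclient_key, claim_no, claim_line_key) tuple is distinct.
con.executemany("INSERT INTO t VALUES (?, ?, ?)",
                [(s, n, 1) for s in range(90) for n in range(50)])

# Reading 1: uniqueness of the head column only.
head_nunique = con.execute(
    "SELECT COUNT(DISTINCT subclient_key) FROM t").fetchone()[0]

# Reading 2: uniqueness of the whole composite key.
full_nunique = con.execute(
    """SELECT COUNT(*) FROM
       (SELECT DISTINCT subclient_key, claim_no, claim_line_key FROM t)"""
).fetchone()[0]

print(head_nunique, full_nunique)
```

Here reading 1 yields 90 while reading 2 yields 4500, so which interpretation your version's sysindexes.nunique uses matters a great deal when you apply the article's percentage formula.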