In the above example, you only need to collect distribution stats on 3 values. So it depends on whether you have highly skewed data, and how many values there are of the column. If there are 32767 different values, then DB2 will likely use the index anyway, even if the data is not skewed, so you definitely do not want to capture that many values.
Originally Posted by ARWinner
If you capture the average number of rows that exist on a page (depends on the row size and page size) that should be plenty. The reason for that is that DB2 will avoid using an index if it assumes that it has to read every page, and if there are more values than there are rows on a page, then DB2 is much more likely to use an index even if the data is not skewed.
In order to make it work, you should use the following runstats command:
runstats on table <table-name> with distribution on key columns and indexes all;
Generally, it only makes sense to capture distribution on key (index) columns, since a skewed distribution typcially affects whether or not an index is used by DB2. Capturing too many stats makes the catalog much larger, and slows down package compilation time.
Last edited by Marcus_A; 05-27-08 at 15:52.
M. A. Feldman
IBM Certified DBA on DB2 for Linux, UNIX, and Windows
IBM Certified DBA on DB2 for z/OS and OS/390