Results 1 to 7 of 7

Thread: Choice of index

  1. #1
    Join Date
    Apr 2011
    Posts
    38

    Unanswered: Choice of index

    I have a table with three fields, let's call them username, eventDate and eventType. I have millions of rows of data in it, and I have two indices created for this table already: idx_event which indexes username and eventDate together, and idx_type which only indexes the eventType field. Both indexes support reverse lookup.

    I want to run queries that all look like this:

    SELECT username, eventDate
    FROM myTable
    WHERE eventType IN (....)
    GROUP BY username, eventDate

    The only part that changes from query to query, is the IN part.

    I wanted to optimize my query so I ran EXPLAIN for one such query, and here is what I found in my SYSTOOLS.EXPLAIN_OPERATOR table:

    Code:
    OPERATOR_ID	OPERATOR_TYPE	TOTAL_COST	IO_COST		CPU_COST	FIRST_ROW_COST
    1		RETURN		635.29		84		1022526.94	635.29
    2		GRPBY 		635.29		84		1022526.94	635.29
    3		TBSCAN		635.29		84		1017026.94	635.29
    4		SORT  		635.29		84		1010239.94	635.29
    5		NLJOIN		635.28		84		981343.19	30.26
    6		TBSCAN		0.00		0		720.00		0.00
    7		FETCH 		30.26		4		92573.96	30.26
    8		IXSCAN		22.70		3		80753.96	22.70
    If I understand this correctly, it reads from bottom to top; first an index (I think it is the idx_type) is used to fetch the reslut of where part, but then there is a join (I think it is the IN part), and then a SORT and GRPBY which is for the GROUP BY.

    Assuming I got it right, it seems that the rate limiting factor is the NLJOIN part. How can I used indices to make it more efficient?

  2. #2
    Join Date
    Jun 2003
    Location
    Toronto, Canada
    Posts
    5,516
    Provided Answers: 1
    Quote Originally Posted by merik View Post
    I have millions of rows of data in it, and I have two indices created for this table already: idx_event which indexes username and eventDate together, and idx_type which only indexes the eventType field.
    I'm willing to bet that the cardinality of the eventType index is too low for that index to be useful. You may want to add that column to the other index, at least you will get index-only access.

    Quote Originally Posted by merik View Post
    I wanted to optimize my query so I ran EXPLAIN for one such query, and here is what I found in my SYSTOOLS.EXPLAIN_OPERATOR table
    You should really use db2exfmt to extract explain information.
    ---
    "It does not work" is not a valid problem statement.

  3. #3
    Join Date
    Apr 2011
    Posts
    38
    Quote Originally Posted by n_i View Post
    I'm willing to bet that the cardinality of the eventType index is too low for that index to be useful. You may want to add that column to the other index, at least you will get index-only access.
    If by cardinality you mean the number of unique values, it is in the order of tens of thousands. In what order should I index the three columns then? eventType then username then eventDate? Or does it matter?

    Quote Originally Posted by n_i View Post
    You should really use db2exfmt to extract explain information.
    I will try to learn it! Thanks for the direction.

  4. #4
    Join Date
    Jun 2003
    Location
    Toronto, Canada
    Posts
    5,516
    Provided Answers: 1
    Quote Originally Posted by merik View Post
    If by cardinality you mean the number of unique values, it is in the order of tens of thousands.
    In other words, about 1/100th of the table cardinality, so the optimizer is unlikely to use it for the WHERE clause. I'd add it after the other two columns. However, you may want to try running your query through db2advis - DB2 Design Advisor command.
    ---
    "It does not work" is not a valid problem statement.

  5. #5
    Join Date
    Jun 2003
    Location
    Toronto, Canada
    Posts
    5,516
    Provided Answers: 1
    Quote Originally Posted by merik View Post
    If by cardinality you mean the number of unique values, it is in the order of tens of thousands.
    In other words, about 1/100th of the table cardinality, so the optimizer is unlikely to use it for the WHERE clause. I'd add it after the other two columns. However, you may want to try running your query through db2advis - DB2 Design Advisor command.
    ---
    "It does not work" is not a valid problem statement.

  6. #6
    Join Date
    May 2013
    Posts
    3
    the index is not effective by the in clause
    SELECT username, eventDate
    FROM myTable
    WHERE eventType =‘’’ and eventType =’‘ 。。。

  7. #7
    Join Date
    Apr 2011
    Posts
    38
    Fantastic! I ran db2advis -d MY_DB_NAME -n MY_SCHEMA_NAME -s "..." where the I pasted my SELECT query in place of the ..., and it took two seconds for the program to tell me that I should create an index on all three columns (it gave me the CREATE statement for that index too).

    Thanks!

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •