| |
|
If this is your first visit, be sure to check out the FAQ by clicking the link above.
You may have to register before you can post: click the register link above to proceed.
To start viewing messages, select the forum that you want to visit from the selection below.
|
 |

09-21-10, 18:05
|
|
Registered User
|
|
Join Date: Nov 2007
Posts: 248
|
|
|
Two indexes on the same table?
|
|
Hi,
I have a table called TEST with these columns --
PID -- integer
PTYPE -- varchar
ID -- integer
PDATE -- TIMESTAMP
PNAME -- varchar
VAL -- double
I have a couple of different procedures that access data from this table --
In one of the procedures, the select query on table TEST is based on .. WHERE pid= AND ptype= AND id =
And, in a few others, the select query on table TEST is based on .. WHERE pid= AND id = AND pdate =
Considering the above, would it make sense to create two indexes on table TEST? Thanks!
CREATE INDEX test1_idx ON TEST (PID, PTYPE, ID)
CREATE INDEX test2_idx ON TEST (PID, ID, PDATE)
|
|

09-21-10, 19:22
|
|
Registered User
|
|
Join Date: May 2009
Posts: 472
|
|
db2user24, Since all of the Where operands are =, you really have a choice between
PID, ID, PTYPE
and/or
PID, ID, PDATE
The answer depends on the cardinality of the columns.
If PID and ID filter down to very few rows, then you probably don't need PTYPE or PDATE.
If PID and ID still returns a LOT of rows and the extra filtering of PTYPE and/or PDATE returns only a few rows, they either one (or both) may be better.
Also what (if anything) is your Unique key? If, for example, PID is unique, then you really don't need to index any other column, since PID is in the Where clause with an = (it would return only 1 row by itself).
With nothing else to go on, I would suggest creating one index at a time and run both queries with explain. See if the access path and run time significantly change. (This is assuming the table has been populated with data).
There are other reasons to create an index. For example, by adding some columns in a particular order, you might avoid a sort.
|
|

09-21-10, 20:14
|
|
Registered User
|
|
Join Date: May 2003
Location: USA
Posts: 5,196
|
|
|
|
Please provide the number of rows in the table, and the number of unique values for each column in your proposed indexes:
select count(*) from table-name; -- to count the rows
Select count(distict PTYPE) from table-name; -- to count the number of unique values in PTYPE
__________________
M. A. Feldman
IBM Certified DBA on DB2 for Linux, UNIX, and Windows
IBM Certified DBA on DB2 for z/OS and OS/390
|
|

09-22-10, 00:00
|
|
Registered User
|
|
Join Date: Nov 2007
Posts: 248
|
|
thanks for the responses! the number of rows in the table is always different.. really depends on the date range the user gives ---
|
Last edited by db2user24; 09-22-10 at 01:46.
|

09-22-10, 00:04
|
|
Registered User
|
|
Join Date: May 2003
Location: USA
Posts: 5,196
|
|
I don't mean to be rude, but if you want an accurate answer, please supply the information requested. It doesn't matter what the query looks like, it matters what the data in the table looks like in terms of the number of unique occurances (all of the data).
If the data is not fully populated yet, you will have to guess on the numbers.
Also, is there a primary key on this table?
__________________
M. A. Feldman
IBM Certified DBA on DB2 for Linux, UNIX, and Windows
IBM Certified DBA on DB2 for z/OS and OS/390
|
|

09-22-10, 00:48
|
|
Registered User
|
|
Join Date: Nov 2007
Posts: 248
|
|
There is no primary key on the table but the PID, PTYPE, ID, PDATE combination is always unique.
Again, the count of records varies on the date range and interval the user selects.
Example of the PID, PTYPE, ID, PDATE combination ---
100, Site, 100, 1/1/2010 02:00
100, Site, 100, 1/1/2010 03:00
100, Site, 100, 1/1/2010 04:00
100, Site, 100, 1/1/2010 05:00
100, Site, 80, 1/1/2010 02:00
100, Site, 80, 1/1/2010 03:00
100, Site, 80, 1/1/2010 04:00
100, Site, 80, 1/1/2010 05:00
|
Last edited by db2user24; 09-22-10 at 01:58.
|

09-22-10, 01:16
|
|
Registered User
|
|
Join Date: Feb 2008
Location: Japan
Posts: 2,193
|
|
db2user24,
I think that it will be enough to show the result of the following query, to answer the Marcus_A's question.
You don't need to edit the result nor to execute other query, if your data was fully populated.
Code:
SELECT COUNT(*) AS count_rows
, COUNT(DISTINCT PID) AS dist_pid
, COUNT(DISTINCT PTYPE) AS dist_ptype
, COUNT(DISTINCT ID) AS dist_id
, COUNT(DISTINCT PDATE) AS dist_pdate
, COUNT(DISTINCT PNAME) AS dist_pname
, COUNT(DISTINCT DIGITS(PID) || PTYPE ) AS dist_pid_ptype
, COUNT(DISTINCT DIGITS(PID) || DIGITS(ID) ) AS dist_pid_id
, COUNT(DISTINCT DIGITS(PID) || PTYPE || DIGITS(ID) ) AS dist_pid_ptype_id
, COUNT(DISTINCT DIGITS(PID) || DIGITS(ID) || CHAR(PDATE) ) AS dist_pid_id_pdate
FROM table-name
;
|
Last edited by tonkuma; 09-22-10 at 01:24.
|

09-22-10, 01:54
|
|
Registered User
|
|
Join Date: Nov 2007
Posts: 248
|
|
thanks, i just ran an example and this is the result --
total count of rows in table = 2232
Distinct ( PTYPE ) = 1
Also, here are the results for the above query :
COUNT_ROWS DIST_PID DIST_PTYPE DIST_ID DIST_PDATE DIST_PNAME DIST_PID_PTYPE DIST_PID_ID DIST_PID_PTYPE_ID DIST_PID_ID_PDATE
----------- ----------- ----------- ----------- ----------- ----------- -------------- ----------- ----------------- -----------------
2232 1 1 3 744 1 1 3 3 2232
1 record(s) selected.
COUNT_ROWS = 2232
DIST_PID = 1
DIST_PTYPE = 1
DIST_ID = 3
DIST_PDATE = 744
DIST_PNAME = 1
DIST_PID_PTYPE = 1
DIST_PID_ID = 3
DIST_PID_PTYPE_ID = 3
DIST_PID_ID_PDATE = 2232
|
Last edited by db2user24; 09-22-10 at 01:57.
|

09-22-10, 03:10
|
|
Registered User
|
|
Join Date: May 2003
Location: USA
Posts: 5,196
|
|
If there are only 2232 rows in the table, DB2 probably will not use any index even if you create one. DB2 can scan all the rows of the table faster than it can look at an index to find the matching RID's and then go to the table to retrieve the data with only 2232 rows.
Assuming the table will be much larger than 2232 rows, you will probably want to have the PDATE column as the first column of one of the indexes (along with the other columns that make each row unique). I would make this a unique index. So the index columns would be like this in order of highest cardinality (PDATE, ID, PID, PTYPE).
It is hard to say about any other indexes since I am not sure that the data you provided and tested against is going to be representative of the real data in the table when fully populated.
__________________
M. A. Feldman
IBM Certified DBA on DB2 for Linux, UNIX, and Windows
IBM Certified DBA on DB2 for z/OS and OS/390
|
|

09-22-10, 13:12
|
|
Registered User
|
|
Join Date: Nov 2007
Posts: 248
|
|
Thanks Marcus! I went ahead and set one index with the columns - PDATE, ID, PID, PTYPE
Hopefully it will help with faster retrieval when there are a large number of records.
I'm now wondering about another table and whether I need an index on it.. here's the info on table two --
column names :
PID
PTYPE
PDATE
VAL
PERIODTYPE
UMSR
count rows can be pretty big.. the example that I ran returned 1210 but again with a different date range, it will definitely be higher.
The select query in the procedure does a look up of PID, PDATE, and VAL based on the WHERE clause PERIODTTYPE = AND UMSR = with a GROUP BY of PID, PDATE
DIST_PID = 1
DIST_PTYPE = 1
DIST_PDATE = 121
DIST_PERIODTYPE = 3
DIST_UMSR = 2
DIST_PERIODTYPE_UMSR = 6
DIST_PERIODTYPE_UMSR_PDATE = 726
Any thoughts on whether an index is even needed on this table?
|
|

09-22-10, 13:49
|
|
:-)
|
|
Join Date: Jun 2003
Location: Toronto, Canada
Posts: 4,449
|
|
Why don't you use the db2advis tool to get index recommendations? It's there exactly for that purpose.
|
|

09-22-10, 16:12
|
|
Registered User
|
|
Join Date: May 2003
Location: USA
Posts: 5,196
|
|
Quote:
Originally Posted by db2user24
...count rows can be pretty big.. the example that I ran returned 1210 but again with a different date range, it will definitely be higher.
|
I am not sure you understood my previous comments. You must provide (or must estimate) statistics for the entire table, not just a particular date range. If you have not loaded all date ranges yet, then you will have to estimate, or defer the question of indexes until you do load all the data. I would definitely define a unique index on the column (or set of columns taken together) that are unique, putting the column with the most unique values first.
__________________
M. A. Feldman
IBM Certified DBA on DB2 for Linux, UNIX, and Windows
IBM Certified DBA on DB2 for z/OS and OS/390
|
|
| Thread Tools |
Search this Thread |
|
|
|
| Display Modes |
Linear Mode
|
Posting Rules
|
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts
HTML code is Off
|
|
|
|
|