Results 1 to 12 of 12
  1. #1
    Join Date
    Nov 2007
    Posts
    265

    Unanswered: Two indexes on the same table?

    Hi,

    I have a table called TEST with these columns --

    PID -- integer
    PTYPE -- varchar
    ID -- integer
    PDATE -- TIMESTAMP
    PNAME -- varchar
    VAL -- double

    I have a couple of different procedures that access data from this table --

    In one of the procedures, the select query on table TEST is based on .. WHERE pid= AND ptype= AND id =

    And, in a few others, the select query on table TEST is based on .. WHERE pid= AND id = AND pdate =


    Considering the above, would it make sense to create two indexes on table TEST? Thanks!

    CREATE INDEX test1_idx ON TEST (PID, PTYPE, ID)

    CREATE INDEX test2_idx ON TEST (PID, ID, PDATE)

  2. #2
    Join Date
    May 2009
    Posts
    508
    Provided Answers: 1
    db2user24, Since all of the Where operands are =, you really have a choice between

    PID, ID, PTYPE
    and/or
    PID, ID, PDATE

    The answer depends on the cardinality of the columns.

    If PID and ID filter down to very few rows, then you probably don't need PTYPE or PDATE.

    If PID and ID still returns a LOT of rows and the extra filtering of PTYPE and/or PDATE returns only a few rows, they either one (or both) may be better.

    Also what (if anything) is your Unique key? If, for example, PID is unique, then you really don't need to index any other column, since PID is in the Where clause with an = (it would return only 1 row by itself).

    With nothing else to go on, I would suggest creating one index at a time and run both queries with explain. See if the access path and run time significantly change. (This is assuming the table has been populated with data).

    There are other reasons to create an index. For example, by adding some columns in a particular order, you might avoid a sort.

  3. #3
    Join Date
    May 2003
    Location
    USA
    Posts
    5,737
    Please provide the number of rows in the table, and the number of unique values for each column in your proposed indexes:

    select count(*) from table-name; -- to count the rows

    Select count(distict PTYPE) from table-name; -- to count the number of unique values in PTYPE
    M. A. Feldman
    IBM Certified DBA on DB2 for Linux, UNIX, and Windows
    IBM Certified DBA on DB2 for z/OS and OS/390

  4. #4
    Join Date
    Nov 2007
    Posts
    265
    thanks for the responses! the number of rows in the table is always different.. really depends on the date range the user gives ---
    Last edited by db2user24; 09-22-10 at 02:46.

  5. #5
    Join Date
    May 2003
    Location
    USA
    Posts
    5,737
    I don't mean to be rude, but if you want an accurate answer, please supply the information requested. It doesn't matter what the query looks like, it matters what the data in the table looks like in terms of the number of unique occurances (all of the data).

    If the data is not fully populated yet, you will have to guess on the numbers.

    Also, is there a primary key on this table?
    M. A. Feldman
    IBM Certified DBA on DB2 for Linux, UNIX, and Windows
    IBM Certified DBA on DB2 for z/OS and OS/390

  6. #6
    Join Date
    Nov 2007
    Posts
    265
    There is no primary key on the table but the PID, PTYPE, ID, PDATE combination is always unique.
    Again, the count of records varies on the date range and interval the user selects.

    Example of the PID, PTYPE, ID, PDATE combination ---

    100, Site, 100, 1/1/2010 02:00
    100, Site, 100, 1/1/2010 03:00
    100, Site, 100, 1/1/2010 04:00
    100, Site, 100, 1/1/2010 05:00
    100, Site, 80, 1/1/2010 02:00
    100, Site, 80, 1/1/2010 03:00
    100, Site, 80, 1/1/2010 04:00
    100, Site, 80, 1/1/2010 05:00
    Last edited by db2user24; 09-22-10 at 02:58.

  7. #7
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    db2user24,

    I think that it will be enough to show the result of the following query, to answer the Marcus_A's question.
    You don't need to edit the result nor to execute other query, if your data was fully populated.

    Code:
    SELECT COUNT(*)              AS count_rows
         , COUNT(DISTINCT PID)   AS dist_pid
         , COUNT(DISTINCT PTYPE) AS dist_ptype
         , COUNT(DISTINCT ID)    AS dist_id
         , COUNT(DISTINCT PDATE) AS dist_pdate
         , COUNT(DISTINCT PNAME) AS dist_pname
         , COUNT(DISTINCT DIGITS(PID) || PTYPE )      AS dist_pid_ptype
         , COUNT(DISTINCT DIGITS(PID) || DIGITS(ID) ) AS dist_pid_id
         , COUNT(DISTINCT DIGITS(PID) || PTYPE || DIGITS(ID) )       AS dist_pid_ptype_id
         , COUNT(DISTINCT DIGITS(PID) || DIGITS(ID) || CHAR(PDATE) ) AS dist_pid_id_pdate
     FROM  table-name
    ;
    Last edited by tonkuma; 09-22-10 at 02:24.

  8. #8
    Join Date
    Nov 2007
    Posts
    265
    thanks, i just ran an example and this is the result --

    total count of rows in table = 2232

    Distinct ( PTYPE ) = 1


    Also, here are the results for the above query :

    COUNT_ROWS DIST_PID DIST_PTYPE DIST_ID DIST_PDATE DIST_PNAME DIST_PID_PTYPE DIST_PID_ID DIST_PID_PTYPE_ID DIST_PID_ID_PDATE
    ----------- ----------- ----------- ----------- ----------- ----------- -------------- ----------- ----------------- -----------------
    2232 1 1 3 744 1 1 3 3 2232

    1 record(s) selected.


    COUNT_ROWS = 2232
    DIST_PID = 1
    DIST_PTYPE = 1
    DIST_ID = 3
    DIST_PDATE = 744
    DIST_PNAME = 1
    DIST_PID_PTYPE = 1
    DIST_PID_ID = 3
    DIST_PID_PTYPE_ID = 3
    DIST_PID_ID_PDATE = 2232
    Last edited by db2user24; 09-22-10 at 02:57.

  9. #9
    Join Date
    May 2003
    Location
    USA
    Posts
    5,737
    If there are only 2232 rows in the table, DB2 probably will not use any index even if you create one. DB2 can scan all the rows of the table faster than it can look at an index to find the matching RID's and then go to the table to retrieve the data with only 2232 rows.

    Assuming the table will be much larger than 2232 rows, you will probably want to have the PDATE column as the first column of one of the indexes (along with the other columns that make each row unique). I would make this a unique index. So the index columns would be like this in order of highest cardinality (PDATE, ID, PID, PTYPE).

    It is hard to say about any other indexes since I am not sure that the data you provided and tested against is going to be representative of the real data in the table when fully populated.
    M. A. Feldman
    IBM Certified DBA on DB2 for Linux, UNIX, and Windows
    IBM Certified DBA on DB2 for z/OS and OS/390

  10. #10
    Join Date
    Nov 2007
    Posts
    265
    Thanks Marcus! I went ahead and set one index with the columns - PDATE, ID, PID, PTYPE

    Hopefully it will help with faster retrieval when there are a large number of records.


    I'm now wondering about another table and whether I need an index on it.. here's the info on table two --

    column names :

    PID
    PTYPE
    PDATE
    VAL
    PERIODTYPE
    UMSR

    count rows can be pretty big.. the example that I ran returned 1210 but again with a different date range, it will definitely be higher.

    The select query in the procedure does a look up of PID, PDATE, and VAL based on the WHERE clause PERIODTTYPE = AND UMSR = with a GROUP BY of PID, PDATE

    DIST_PID = 1
    DIST_PTYPE = 1
    DIST_PDATE = 121
    DIST_PERIODTYPE = 3
    DIST_UMSR = 2
    DIST_PERIODTYPE_UMSR = 6
    DIST_PERIODTYPE_UMSR_PDATE = 726


    Any thoughts on whether an index is even needed on this table?

  11. #11
    Join Date
    Jun 2003
    Location
    Toronto, Canada
    Posts
    5,516
    Provided Answers: 1
    Why don't you use the db2advis tool to get index recommendations? It's there exactly for that purpose.

  12. #12
    Join Date
    May 2003
    Location
    USA
    Posts
    5,737
    Quote Originally Posted by db2user24 View Post
    ...count rows can be pretty big.. the example that I ran returned 1210 but again with a different date range, it will definitely be higher.
    I am not sure you understood my previous comments. You must provide (or must estimate) statistics for the entire table, not just a particular date range. If you have not loaded all date ranges yet, then you will have to estimate, or defer the question of indexes until you do load all the data. I would definitely define a unique index on the column (or set of columns taken together) that are unique, putting the column with the most unique values first.
    M. A. Feldman
    IBM Certified DBA on DB2 for Linux, UNIX, and Windows
    IBM Certified DBA on DB2 for z/OS and OS/390

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •