Page 1 of 2 12 LastLast
Results 1 to 15 of 16
  1. #1
    Join Date
    May 2007
    Posts
    80

    what if there's no index for ORDER BY to use ?

    Got a table with horse races, contains 85000 records, got a primary key.
    Also has a field for Date,Time,Course,Etc... indexed as needed for that table.

    CREATE TABLE `race` (
    `RACE_ID` mediumint(1) unsigned NOT NULL auto_increment,
    `RACEDATE` date default NULL,
    `RACETIME` time default NULL,
    PRIMARY KEY (`RACE_ID`)
    )

    Got a table with Horses, contains 100000 records, got a single primary key.

    CREATE TABLE `horse` (
    `HORSE_ID` mediumint(1) unsigned NOT NULL auto_increment,
    PRIMARY KEY (`HORSE_ID`)
    )

    Got a table with runners, horses participating in a particular race with the primary key being a combination of the RACE_ID and HORSE_ID. This also has a field called position being the finishing position of a horse in that race. Contains 1 million records. ( on average 10 horses in a race )

    CREATE TABLE `racerunner` (
    `RACE_ID` mediumint(1) unsigned NOT NULL,
    `HORSE_ID` mediumint(1) unsigned NOT NULL,
    `Finishing_position` tinyint(3) unsigned default NULL,
    PRIMARY KEY (`RACE_ID`,`HORSE_ID`)
    )

    I want to extract a list of races with their horses ordered by date and position.

    Code:
    SELECT race.race_id,
           race.racedate,
           race.racetime,
           racerunner.Finishing_position,
           racerunner.horse_id
      FROM racerunner
      JOIN race
        ON race.RACE_ID = racerunner.RACE_ID
     ORDER
        BY race.racedate,race.racetime,racerunner.Finishing_position;
    This query without the ORDER BY clause runs in 0.03 seconds,
    with the ORDER BY it takes a full 5 seconds.

    The main problem being that ORDER BY does not have an index it can use.
    I suppose i could set an index on the Finishing_position but all that will do really is split 1 million records in 10 large groups.

    How do handle a situation where there's no index for ORDER BY to use ?
    How do you create one to get the speed ?

  2. #2
    Join Date
    Jun 2003
    Location
    Toronto, Canada
    Posts
    5,516
    Quote Originally Posted by BettingSherlock
    a field for Date,Time,Course,Etc... indexed as needed for that table.
    Could you please elaborate? As evidenced by your question, your statement may not be completely correct. By the way, indexes are mostly for queries, not for tables...
    ---
    "It does not work" is not a valid problem statement.

  3. #3
    Join Date
    May 2007
    Posts
    80
    I just mean there are a number of additional fields not relevant to this query which i've omitted to keep things focussed. Also have some indexes on that table for other queries but none of them seem to work for this one. At least i can't figure it out.

  4. #4
    Join Date
    Jun 2003
    Location
    Toronto, Canada
    Posts
    5,516
    Quote Originally Posted by BettingSherlock
    Also have some indexes on that table for other queries but none of them seem to work for this one.
    That's what I wanted to see. How do you know they are not being used.

    The use of indexes depends very much on the particular DBMS engine, which you chose not to disclose for some reason, and general considerations may not be applicable in specific cases. However, I would try an index on race.race_id,race.racedate,race.racetime.
    ---
    "It does not work" is not a valid problem statement.

  5. #5
    Join Date
    Nov 2003
    Posts
    2,935
    Quote Originally Posted by n_i
    depends very much on the particular DBMS engine, which you chose not to disclose for some reason
    Just have a look at the quotes. MySQL is the only database that uses these nasty back-tick quotes.

  6. #6
    Join Date
    May 2007
    Posts
    80
    I use MySQL but i don't consider this a database specific question, more about how to design the database structure and indexes. So I'm probably mistaken there and i do need to look at MySQL specifics.

    Good point about how to know if the other indexes aren't being used. Well MySQL's EXPLAIN command tells me that, but there is of course the possibility that the existing indexes somehow confuse the ORDER BY clause.

    So made a copy and removed all other fields and indexes so it's exactly as posted here. With no indexes, just the primary keys the query is 0.003sec without ORDER BY and 6.3sec with.

    Going to try some indexes now....

  7. #7
    Join Date
    May 2007
    Posts
    80
    RACE table index on RACEDATE,RACETIME
    RACERUNNER table index on Finishing_position

    as those are the ones used with the ORDER BY down to 4.4sec but explain tells me no indexes used for the sorting.

    RACE table index on RACE_ID,RACEDATE,RACETIME
    RACERUNNER table index on RACE_ID,Finishing_position

    added the RACE_ID as it is used in the join but still 4.3sec


    What i'm wondering is since ORDER BY uses fields from 2 different tables is any kind of indexing on each individual table ever going to be used ?
    You can't create an index covering multiple tables can you ?

  8. #8
    Join Date
    May 2007
    Posts
    80
    Just re-read some of the documentation and i should mention i'm on version 5.0.21 , ( was reading 5.1 without noticing and took 30 min to figure out why USE INDEX FOR ORDER BY generated an error )

  9. #9
    Join Date
    May 2007
    Posts
    80
    CREATE TABLE `racerunnerdup` (
    `RACE_ID`
    `HORSE_ID`
    `RACEDATE`
    `RACETIME`
    `Position`
    PRIMARY KEY (`RACE_ID`,`HORSE_ID`),
    KEY `dtp` (`RACEDATE`,`RACETIME`,`Position`)
    ) ENGINE=MyISAM;

    Ok, just to try it out i created this table and filled it with data.
    So i duplicated RACEDATE and RACETIME from the RACE table into the RACERUNNER table so i would have all fields for the ORDER BY in one table and one index.

    Code:
    SELECT r.RACE_ID,
           d.RACEDATE,
           d.RACETIME,
           d.Position,
           d.HORSE_ID
      FROM racerunnerdup as d FORCE INDEX (dtp)
      JOIN race as r
        ON r.RACE_ID = d.RACE_ID
     ORDER
        BY RACEDATE,RACETIME,Position;
    Runs in 0.003sec
    ( but only if the FORCE INDEX is in place ! 3.9 without )

  10. #10
    Join Date
    May 2007
    Posts
    80
    Code:
    SELECT race.race_id,
           race.racedate,
           race.racetime,
           racerunner.position,
           racerunner.horse_id
      FROM racerunner FORCE INDEX (pos)
      JOIN race FORCE INDEX (datetime)
        ON race.RACE_ID = racerunner.RACE_ID
     ORDER
        BY race.racedate,race.racetime,racerunner.position;
    Same trick doesn't work here thou, which would make sense to me as 2 individual indexes don't make a single.

    Not too keen on duplicating data so back to the original question: what to do when ORDER BY uses fields from different tables and therefore can't (can it?) use an index ?

  11. #11
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Create an index on race (racedate, racetime) and another index on racerunner (race_id, position).

    Don't FORCE anything.

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  12. #12
    Join Date
    May 2007
    Posts
    80
    Thanks Pat, ok tried that,
    RACE index on (RACEDATE,RACETIME)
    RACERUNNER index on (RACE_ID,Finishing_position)
    no forcing of anything,
    EXPLAIN tells me it uses PRIMARY indexes but nothing more,
    speed of query is 9 sec

    then i added the RACE_ID to the RACE:
    RACE index on (RACE_ID,RACEDATE,RACETIME)
    RACERUNNER index on (RACE_ID,Finishing_position)
    now EXPLAIN tells me it is using an index "Using index; Using temporary; Using filesort"
    unfortunately the query runs in over 8seconds so not much help,
    ( i suspect the index is only partially used by ORDER BY )

    I really appreciate the replies so far, it's a situation which happens quit often and as no-one has yet replied "that's easy, you just ... " i take it that what you do isn't something obvious, nor generic. I suppose most of the time the amount of data is so small most people don't bother looking into it. But the time different is huge so it pay's of to get it right.

  13. #13
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Which MySQL engine are these tables running? If both are Inno-DB or both are MyISAM, then the indices I suggested should find and use these indexes to produce results without a sort. I didn't try using your exact 5.0.13 release, or a mix of Inno-DB and MyISAM (which is often squirrely in performance for me).

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  14. #14
    Join Date
    May 2007
    Posts
    80
    all the same MyISAM.

  15. #15
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    This is pure "air code" I haven't tested it, but it ought to do what you want "out of the box" even though I've marked two possible changes in the final SELECT that might be helpful if it doesn't work as-is.
    Code:
    CREATE TABLE `race` (
       `RACE_ID` 		mediumint(1) unsigned NOT 	NULL auto_increment
    ,  `RACEDATE` 		date default 			NULL
    ,  `RACETIME` 		time default 			NULL
    ,  PRIMARY KEY (`RACE_ID`)
    );
    
    CREATE TABLE `horse` (
       `HORSE_ID` mediumint(1) unsigned NOT NULL auto_increment
     ,  PRIMARY KEY (`HORSE_ID`)
    );
    
    CREATE TABLE `racerunner` (
       `RACE_ID` 		mediumint(1) unsigned		NOT NULL
    ,  `HORSE_ID` 		mediumint(1) unsigned		NOT NULL
    ,  `Finishing_position` tinyint(3) unsigned default 	NULL
    ,  PRIMARY KEY (`RACE_ID`,`HORSE_ID`)
    );
    
    CREATE INDEX XIE01race ON race (RACEDATE, RACETIME, RACE_ID);
    CREATE INDEX XIE01racerunner ON racerunner (RACE_ID, Finishing_position);
    
    SELECT race.race_id,
           race.racedate,
           race.racetime,
           racerunner.position,
           racerunner.horse_id
      FROM racerunner 	-- FORCE INDEX (XIE01race)		-- Second choice
      JOIN race 		-- FORCE INDEX (XIE01racerunner)		-- First choice
        ON race.RACE_ID = racerunner.RACE_ID
     ORDER
        BY race.racedate,race.racetime,racerunner.position;
    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •