If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Database Server Software > MySQL > optimising a query, how to ?

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 08-31-09, 16:05
BettingSherlock BettingSherlock is offline
Registered User
 
Join Date: May 2007
Posts: 78
optimising a query, how to ?

I'd like to know if there's a way to speed up this query:

Code:
select
RACEDATE,races.name,country, RACETIME,RACEORDER,RACESTATUS, races.handicap,races. runnersNo,races.places, races.title,wintime, position,draw,horse.name,horse.registration_country, jockey.lastname,trainer.lastname
from
(select
race.race_id, RACEDATE,RACETIME,RACEORDER,RACESTATUS, course.name,if(sbs_regions_and_countries.Englishname = "Ireland","(IRE)","(GB)") as country, race.title,wintime, handicap,runnersNo,places
FROM race
left join course on course.course_id = race.course_id left join course_country on course_country.course_id = course.course_id left join sbs_regions_and_countries on sbs_regions_and_countries.id = course_country.country_id left join race_derived on race_derived.race_id = race.race_id
where
/* racedate = UTC_DATE() */ racedate = "2006-06-10" /* ( racedate >= "2006-06-01" and racedate <= "2006-06-31") */ /* ( racedate >= "2007-01-01" and racedate <= "2007-12-31") */ and ( sbs_regions_and_countries.englishname = "England" or sbs_regions_and_countries.Englishname = "Ireland" )
) as races
join racerunner on racerunner.race_id = races.race_id left join horse on racerunner.horse_id = horse.horse_id left join trainer on racerunner.trainer_id = trainer.trainer_id left join jockey on racerunner.jockey_id = jockey.jockey_id
order by racedate,racetime,raceorder,position,draw;

We're talking horse racing here.
I want a list of races with runners for a timeperiod and limited to countries "england" and "ireland". This is what i do in the inner select where i join the country data to the course of the races for the timeperiod. A "race" is always run on a "course" (or track) which is in a "country".

This way i have a sub-selection of races BEFORE i join in the horses.

Each "race" has any number of horses, called "runners" and each one has a "jockey" and "trainer" associated with it, that's the outer select.


When i run this query on a full year then i get about 9000 races with 100000 runners and the query runs in 30 to 40 seconds. Fair enough for retireving and sorting 100000 records. But just wondering if i can get it faster.

The inner select which selects the sub-set of races in itself runs in 0.009s so not much to gain there i'd say.

If i remove the "order by" part then the whole query runs in 3 seconds, so obviously that is where there's time to gain. I just don't know how ?
Reply With Quote
  #2 (permalink)  
Old 08-31-09, 16:14
BettingSherlock BettingSherlock is offline
Registered User
 
Join Date: May 2007
Posts: 78
Actually it's more of a general question as a learning opportunity rather than about this query specifically.

How do i post the result of the EXPLAIN command ?
Can't post HTML on this forum so a copy&paste gets all screwed up ?
Reply With Quote
  #3 (permalink)  
Old 08-31-09, 16:31
BettingSherlock BettingSherlock is offline
Registered User
 
Join Date: May 2007
Posts: 78
Ran the explain command and attached the result as an image here.
Attached Thumbnails
optimising a query, how to ?-explain.jpg  
Reply With Quote
  #4 (permalink)  
Old 09-02-09, 16:43
dav1mo dav1mo is offline
Registered User
 
Join Date: Dec 2007
Location: Richmond, VA
Posts: 782
code
select RACEDATE
,course.name
,if(sbs_regions_and_countries.Englishname = "Ireland","(IRE)","(GB)") as country
,RACETIME
,RACEORDER
,RACESTATUS
,handicap
,runnersNo
,places
,title
,wintime
,position
,draw
,horse.name
,horse.registration_country
,jockey.lastname
,trainer.lastname
FROM race
left join course on course.course_id = race.course_id
left join course_country on course_country.course_id = course.course_id
inner join sbs_regions_and_countries on sbs_regions_and_countries.id = course_country.country_id
and
sbs_regions_and_countries.englishname in ( "England","Ireland")
left join race_derived on race_derived.race_id = race.race_id
inner join racerunner on racerunner.race_id = race.race_id
left join horse on racerunner.horse_id = horse.horse_id
left join trainer on racerunner.trainer_id = trainer.trainer_id
left join jockey on racerunner.jockey_id = jockey.jockey_id
where/* racedate = UTC_DATE() */
racedate = "2006-06-10"
/* ( racedate >= "2006-06-01" and racedate <= "2006-06-31") */
/* ( racedate >= "2007-01-01" and racedate <= "2007-12-31") */

order by racedate
,racetime
,raceorder
,position
,draw;
/code
Reply With Quote
  #5 (permalink)  
Old 09-02-09, 16:45
dav1mo dav1mo is offline
Registered User
 
Join Date: Dec 2007
Location: Richmond, VA
Posts: 782
i rearranged things a bit, as the nested expression wasn't needed, no summation/grouping/min/max there. Also, changed join to sbs_countries to an inner join as it was a waste to join to all the tables, then throw out conditions in the where clause. Probably still some issue with the sorting, just check sort heap sizes to ensure enough room for amount of data being sorted.

Dave
Reply With Quote
  #6 (permalink)  
Old 09-02-09, 18:58
BettingSherlock BettingSherlock is offline
Registered User
 
Join Date: May 2007
Posts: 78
thanks for the reply,
tried that but still around 30 seconds

You've actually got close to what i started with.
Here's the problem, the database contains 140000 races with an average of 10 runners per race. The way you've done it is that you join all of those first, before the where clause, giving 1.4 million records, then make a selection from those millions. That takes time.

That's why i got the inner select so from the 140000 races i first select the 10000 i need, using existing indexes so pretty fast, and then join them with the racerunners which is a lot less records to join.

The dis-advantage here is that by using the inner select i save time by joining less records but i lose the ability to use an index for the ordering.
Reply With Quote
  #7 (permalink)  
Old 09-02-09, 19:07
BettingSherlock BettingSherlock is offline
Registered User
 
Join Date: May 2007
Posts: 78
Code:
inner join sbs_regions_and_countries on sbs_regions_and_countries.id = course_country.country_id
and
sbs_regions_and_countries.englishname in ( "England","Ireland")
this would appear to be a bit faster than having it in the WHERE so taken that on board. (note to self: re-read documentation on inner join )
Reply With Quote
  #8 (permalink)  
Old 09-02-09, 19:42
BettingSherlock BettingSherlock is offline
Registered User
 
Join Date: May 2007
Posts: 78
I've gone a slightly different direction. I need the selection of races for a number of queries so what i've done is created a procedure and i've put the inner selection of races into a temporary table in memory, with an index.

Then i use that to join the other tables with. Testing it with a myisam table rather than in memory has shaved 5 seconds off so down to 25 seconds.

Without order by it's now 0.7 seconds and with 25 seconds so still thinking there should be something i can do to speed that up.
Reply With Quote
  #9 (permalink)  
Old 09-02-09, 20:07
BettingSherlock BettingSherlock is offline
Registered User
 
Join Date: May 2007
Posts: 78
WAHEEEEY down to 1.5 seconds

Step 1, using a temporary table for a selection of races,
Step 2, using a second temporary table for a selection of those races with their runners,
including all fields required for ordering in an index,
Step 3, join all other tables and select with order by

put the whole thing in a procedure and it runs in 1.5 seconds.

Code:
DELIMITER $$

DROP PROCEDURE IF EXISTS `sbs_racingdb`.`testit` $$
CREATE PROCEDURE `testit`()
BEGIN

/* temporary table to hold sub-selection of races */
CREATE TEMPORARY TABLE IF NOT EXISTS `temporary_races` (
  `RACE_ID` mediumint(1) unsigned NOT NULL,
  PRIMARY KEY (`RACE_ID`)
) ENGINE=MEMORY;

/* temporary table to hold races and runners,
   including all fields needed for ordering with an index */
CREATE TEMPORARY TABLE IF NOT EXISTS `temporary_runners` (
  `RACE_ID` mediumint(1) unsigned NOT NULL,
  `RACEDATE` date default NULL,
  `RACETIME` time default NULL,
  `RACEORDER` tinyint(2) default NULL,
  `HORSE_ID` mediumint(1) unsigned NOT NULL,
  `position` mediumint(1) unsigned NOT NULL,
  `draw` mediumint(1) unsigned NOT NULL,
  `jockey_ID` mediumint(1) unsigned NOT NULL,
  `trainer_ID` mediumint(1) unsigned NOT NULL,
  PRIMARY KEY (`RACE_ID`,`HORSE_ID`),
  key (`RACEDATE`,`RACETIME`,`RACEORDER`,`position`,`draw`)
) ENGINE=MEMORY;


/* select needed races */
insert into temporary_races
select
    race.race_id
  FROM race
    left join course on course.course_id = race.course_id
    left join course_country on course_country.course_id = course.course_id
    inner join sbs_regions_and_countries
      on  sbs_regions_and_countries.id = course_country.country_id
      and sbs_regions_and_countries.englishname in ( "England","Ireland")
  where
    /* racedate = UTC_DATE() */
    /* ( racedate >= "2006-06-10" and racedate <= "2006-06-10") */
    /* ( racedate >= "2006-06-01" and racedate <= "2006-06-31") */
    ( racedate >= "2007-01-01" and racedate <= "2007-12-31")
;

/* for those races select all runners and fields needed for ordering */
insert into temporary_runners
  select
    race.race_id,RACEDATE,RACETIME,RACEORDER,horse_id,position,draw,jockey_id,trainer_id
  from
    temporary_races
    left join race on race.race_id = temporary_races.race_id
    inner join racerunner on racerunner.race_id = temporary_races.race_id
;

/* bring it all together */
select
    temporary_runners.RACEDATE,course.name as coursename,
    temporary_runners.RACETIME,temporary_runners.RACEORDER,RACESTATUS,
    handicap,runnersNo,places,
    race.title,wintime,
    position,draw,horse.name as horsename,horse.registration_country as horsecountry,
    jockey.lastname as jockeyname,trainer.lastname as trainername
  from
    temporary_runners
    left join race on race.race_id = temporary_runners.race_id
    left join course on course.course_id = race.course_id
    left join race_derived on race_derived.race_id = temporary_runners.race_id
    left join horse   on temporary_runners.horse_id = horse.horse_id
    left join trainer on temporary_runners.trainer_id = trainer.trainer_id
    left join jockey  on temporary_runners.jockey_id = jockey.jockey_id
  order by racedate,racetime,raceorder,position,draw
;
/* return resultset */

END $$

DELIMITER ;
But now i'm thinking i may be doing this the long way around and there's something simpler possible
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On