Results 1 to 9 of 9
  1. #1
    Join Date
    Oct 2004
    Location
    Paris, FRANCE
    Posts
    132

    Post Unanswered: JOINs Optimisation

    I have this kind of simple query that I'd like to optimise:
    Code:
    SELECT tup.session_id
      FROM tutor_planner AS tup
      JOIN center_master AS cem ON (cem.center_id = tup.center_id)
     WHERE TRUE
       AND tup.user_id = 914001
       AND tup.date > "2009-01-01"
    ;
    - tutor_planner ~ 500 000 rows
    - center_master ~ 300 rows
    - tup.session is the primary key of tutor_planner
    - cem.center_id, tup.user_id and tup.date are indexed

    This query takes 15s. to retreive 600 rows
    If I remove the center_master JOIN it will take 0.3s. to retrieve the 600 rows.

    Here is the EXPLAIN
    Code:
    +----+-------------+-------+-------+-------------------+---------+----------+------------------+------+-------------+
    | id | select_type | table | type  | possible_keys     | key     | key_len  | ref              | rows | Extra       |
    +----+-------------+-------+-------+-------------------+---------+----------+------------------+------+-------------+
    | 1  | SIMPLE      | cem   | index | PRIMARY           | PRIMARY | 4        |                  | 294  | Using index |
    | 1  | SIMPLE      | tup   | ref   | Tutor,Center,Date | Center  | 5        | v8.cem.center_id | 18   | Using where |
    +----+-------------+-------+-------+-------------------+---------+----------+------------------+------+-------------+
    Why doesn't the optimizer begin by tup ?

  2. #2
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    You should really give us the full index definition and preferably the table definition as well (ie the create statements). Why do you have "where true and ..." ? If the index on tup starts with date and then has user_id then it won't be used in this case. Date is a poor choice of field name as it can easily confuse with keywords.

  3. #3
    Join Date
    Oct 2004
    Location
    Paris, FRANCE
    Posts
    132
    Quote Originally Posted by mike_bike_kite
    You should really give us the full index definition and preferably the table definition as well (ie the create statements).
    Sorry, here is it
    Code:
    CREATE TABLE `tutor_planner` (
      `plan_id` varchar(32) NOT NULL DEFAULT '0',
      `planner_status_id` int(11) NOT NULL DEFAULT '0' COMMENT '1->Planned|2->Cancelled|3->Done',
      `session_id` varchar(32) NOT NULL DEFAULT '0',
      `user_id` varchar(32) DEFAULT NULL,
      `center_id` int(11) DEFAULT NULL,
      `product_id` int(11) NOT NULL DEFAULT '0',
      `date` date NOT NULL DEFAULT '0000-00-00',
      `start_time` time NOT NULL DEFAULT '00:00:00',
      `end_time` time NOT NULL DEFAULT '00:00:00',
      `valid_date` date DEFAULT NULL COMMENT 'This field of Validation will discard inconsistencies',
      `valid_who` varchar(32) DEFAULT NULL COMMENT 'Who validated the session',
      `valid_reason` varchar(100) DEFAULT NULL COMMENT 'The reason for validation',
      `day` int(11) DEFAULT '0',
      `month` int(11) DEFAULT '0',
      `year` int(11) DEFAULT '0',
      `start_hours` int(11) DEFAULT '0',
      `start_minutes` int(11) DEFAULT '0',
      `end_hours` int(11) DEFAULT '0',
      `end_minutes` int(11) DEFAULT '0',
      `reason_for_cancellation` varchar(255) NOT NULL DEFAULT '',
      `workgroup_id` int(11) DEFAULT '0',
      `gp_name` varchar(20) DEFAULT NULL,
      `reassign` int(1) DEFAULT '0',
      `write_who` varchar(32) DEFAULT NULL,
      `write_when` datetime DEFAULT NULL,
      `animation` varchar(50) DEFAULT NULL,
      `usefullness` varchar(50) DEFAULT NULL,
      `session_desc` text NOT NULL,
      PRIMARY KEY (`session_id`),
      KEY `Status` (`planner_status_id`),
      KEY `Tutor` (`user_id`),
      KEY `Center` (`center_id`),
      KEY `Date` (`date`),
      KEY `Product` (`product_id`),
      KEY `Who` (`write_who`),
      KEY `VWho` (`valid_who`)
    ) ENGINE=MyISAM DEFAULT CHARSET=utf8 COMMENT='This table contains all the planner details for a tutor. Thi'
    ;
    
    CREATE TABLE `center_master` (
      `center_id` int(11) NOT NULL AUTO_INCREMENT,
      `center_name` varchar(100) NOT NULL DEFAULT '',
      `center_add_1` varchar(100) NOT NULL DEFAULT '',
      `center_add_2` varchar(100) DEFAULT NULL,
      `center_city` varchar(50) NOT NULL DEFAULT '',
      `center_country` varchar(50) NOT NULL DEFAULT '',
      `center_postcode` varchar(10) NOT NULL DEFAULT '',
      `center_phone` varchar(50) NOT NULL DEFAULT '',
      `center_phone1` varchar(50) NOT NULL DEFAULT '',
      `center_fax` varchar(50) DEFAULT NULL,
      `center_direction_details` varchar(255) DEFAULT NULL,
      `center_human_resource_manager_name` varchar(50) DEFAULT NULL,
      `center_human_resource_manager_contact` varchar(50) DEFAULT NULL,
      `center_emailid` varchar(50) NOT NULL DEFAULT '',
      ` center_compte_ana` varchar(100) NOT NULL DEFAULT '',
      `end_date` date DEFAULT NULL,
      ` center_contract_hours` int(11) DEFAULT NULL,
      `deleted` tinyint(4) NOT NULL DEFAULT '0',
      ` is_online` int(1) NOT NULL DEFAULT '1',
      ` offline_password` varchar(50) DEFAULT NULL,
      `regions_id` tinyint(4) DEFAULT NULL,
      `display_requests` tinyint(1) NOT NULL DEFAULT '0' COMMENT 'Used in the dashboard to show the learner requests',
      `freezed_date` date DEFAULT NULL,
      `freezed_done_who` varchar(32) DEFAULT NULL COMMENT 'Who did freeze this center',
      `freezed_done_date` date DEFAULT NULL COMMENT 'When did this center be freezed',
      `svalid_date` date DEFAULT NULL,
      `svalid_done_who` varchar(32) DEFAULT NULL,
      `svalid_done_date` datetime DEFAULT NULL,
      `emergency` text,
      ` learner_entered` date DEFAULT NULL,
      ` have_rates` varchar(30) DEFAULT NULL,
      `comment` text,
      `tm_name` varchar(100) DEFAULT NULL,
      `tm_phone` varchar(50) DEFAULT NULL,
      `tm_email` varchar(50) DEFAULT NULL,
      `learner_DIY_status` tinyint(1) unsigned NOT NULL DEFAULT '0' COMMENT '1 -> enable DIY',
      `default_lunch_time` time NOT NULL DEFAULT '13:00:00',
      `center_report_separator` char(1) DEFAULT NULL,
      `sign_valid_mode` int(1) DEFAULT '1' COMMENT '1 -> Week | 2-> Month',
      `snapshot_access` tinyint(1) unsigned NOT NULL DEFAULT '0',
      `signature_check` tinyint(1) unsigned NOT NULL DEFAULT '0',
      `center_doc` varchar(255) DEFAULT NULL,
      ` place` text NOT NULL,
      ` contact_me` tinyint(1) NOT NULL DEFAULT '0',
      ` center_need` decimal(10,1) DEFAULT NULL,
      `center_room` text NOT NULL,
      `by_metro_train` text COMMENT 'This fiels is for center1',
      `by_car` text COMMENT 'This fiels is for center1',
      `by_bus` text COMMENT 'This fiels is for center1',
      `contact_person` text COMMENT 'This fiels is for center1',
      `reception` text COMMENT 'This fiels is for center1',
      `get_to_center` text COMMENT 'This fiels is for center1',
      `center_key` text COMMENT 'This fiels is for center1',
      `computer` text COMMENT 'This fiels is for center1',
      `learner_files` text COMMENT 'This fiels is for center1',
      `photocopy` text COMMENT 'This fiels is for center1',
      `lunch` text COMMENT 'This fiels is for center1',
      `toilette_bevrages` text COMMENT 'This fiels is for center1',
      `end_of_day` text COMMENT 'This fiels is for center1',
      `misc` text COMMENT 'This fiels is for center1',
      `problems_getting_there` text COMMENT 'This fiels is for center1',
      `computer_hotline` text COMMENT 'This fiels is for center1',
      `center_tm` text COMMENT 'This fiels is for center1',
      `center_resources` text COMMENT 'This fiels is for center1',
      `update_who` varchar(32) DEFAULT NULL COMMENT 'This fiels is for center1',
      `update_when` datetime DEFAULT NULL COMMENT 'This fiels is for center1',
      `specifics_details` text COMMENT 'This fiels is for center1',
      `signature_sheets` text,
      `activity_recap` text,
      `vcr_access` tinyint(2) DEFAULT '1' COMMENT '0 => InActive | 1 => Active',
      `pivot_access` tinyint(2) DEFAULT '0' COMMENT '0 => Cant assign pivot | 1=> Can assign pivot',
      PRIMARY KEY (`center_id`),
      KEY `Region` (`regions_id`),
      KEY `CenterName` (`center_name`),
      KEY `freezed_done_who` (`freezed_done_who`),
      KEY `svalid_done_who` (`svalid_done_who`),
      KEY `center_city` (`center_city`),
      KEY `center_compte_ana` (` center_compte_ana`)
    ) ENGINE=MyISAM AUTO_INCREMENT=490 DEFAULT CHARSET=utf8 COMMENT='The center_master table contains all the information of vari';
    (Don't worry, the fields which starts by a space have been deactivated to see if it creates any errors in our system)

    Quote Originally Posted by mike_bike_kite
    Why do you have "where true and ..." ?
    For maintenance purpose.
    It allows me to toggle easily whatever WHERE statement.

    Quote Originally Posted by mike_bike_kite
    If the index on tup starts with date and then has user_id then it won't be used in this case.
    I don't understand this

    Quote Originally Posted by mike_bike_kite
    Date is a poor choice of field name as it can easily confuse with keywords.
    I agree with you

  4. #4
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    Could you try the following SQL and say if it runs any faster
    Code:
    SELECT tup.session_id
    FROM tutor_planner AS tup, 
          center_master AS cem
    WHERE tup.user_id = "914001"
       AND tup.date > "2009-01-01"
       AND cem.center_id = tup.center_id

  5. #5
    Join Date
    Oct 2004
    Location
    Paris, FRANCE
    Posts
    132
    It's not faster.
    (I removed the "" arround user_id at the beginning to avoid confusing anybody)
    The EXPLAIN is the same

  6. #6
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    Quote Originally Posted by gtk
    It's not faster.
    Just some comments then:
    • I saw the user_id was a varchar so assumed that when you were comparing it to an int value it would not use the index on that field.
    • It won't use the index on date because it probably figures that most of the dates are after "2009-01-01".
    • Are there many different user_id's in the table - just a long shot but if many of the records have this id then the optimiser might figure that restricting on this field isn't worth while.
    • All your indexes are on single fields and not on combinations of fields which means the optimiser has to pick just one field to use - this is difficult.

    Personally I'd change user_id into an int field. Then I'd have an index on user_id and date combined. You could try this out by just selecting the data from the existing table into the new table and creating the index. I'd also get rid of most of your current indexes as they look unlikely to be used.

  7. #7
    Join Date
    Oct 2004
    Location
    Paris, FRANCE
    Posts
    132
    It's working well with the combinated index !!!
    Thanks Mike.

    I know that it's better to have the user_id in int.
    We have a legacy issue which prevents us from changing the format easily.
    (the varchar was used mainly for synchronization between multiple offline databases for client security reasons (the pentagon... kidding))
    Quote Originally Posted by mike
    It won't use the index on date because it probably figures that most of the dates are after "2009-01-01".
    You mean that an index can't be used with a BETWEEN or > < ?
    Quote Originally Posted by mike
    All your indexes are on single fields and not on combinations of fields which means the optimiser has to pick just one field to use - this is difficult.
    In fact we can find all the combinations of search possible in our system.
    Do you think I'd better create only those that are the most used ?

  8. #8
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    Quote Originally Posted by gtk
    I know that it's better to have the user_id in int.
    We have a legacy issue which prevents us from changing the format easily.
    (the varchar was used mainly for synchronization between multiple offline databases for client security reasons (the pentagon... kidding))
    It may be better then to use varchars when comparing against your ids that are varchars ie = "914001"

    Quote Originally Posted by gtk
    You mean that an index can't be used with a BETWEEN or > < ?
    The optimisor makes a choice when looking at the indexes - if it feels that it won't be reducing the result set by much then it will ignore that index. When comparing against all the dates since "1 Jan 2009" then it probably feels it's not worth using that index above any other.

    Quote Originally Posted by gtk
    In fact we can find all the combinations of search possible in our system.
    Do you think I'd better create only those that are the most used ?
    This way of creating indexes isn't going to provide you with good results. You are simply swamping the optomisor with too many options and none of them are particularly good choices. Sadly the subject of performance tuning and indexes is quite in depth and it's difficult to cover much in a forum like this.

  9. #9
    Join Date
    Oct 2004
    Location
    Paris, FRANCE
    Posts
    132
    Quote Originally Posted by mike_bike_kite
    The optimisor makes a choice when looking at the indexes - if it feels that it won't be reducing the result set by much then it will ignore that index. When comparing against all the dates since "1 Jan 2009" then it probably feels it's not worth using that index above any other.
    Got the point.

    Quote Originally Posted by mike_bike_kite
    This way of creating indexes isn't going to provide you with good results. You are simply swamping the optomisor with too many options and none of them are particularly good choices. Sadly the subject of performance tuning and indexes is quite in depth and it's difficult to cover much in a forum like this.
    You are right

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •