| |
|
If this is your first visit, be sure to check out the FAQ by clicking the link above.
You may have to register before you can post: click the register link above to proceed.
To start viewing messages, select the forum that you want to visit from the selection below.
|
 |
|

10-03-09, 12:03
|
|
Registered User
|
|
Join Date: Oct 2004
Location: Paris, FRANCE
Posts: 132
|
|
|
INDEX size threshold
|
|
Hi guys,
I have tables in my database that, I noticed recently,
don't use some evident indexes.
So I tried by copying smaller tables which uses the indexes well.
I have the impression that when the table surpasses a certain size,
it doesn't use certain indexes anymore.
Is there a way to set a threshold anywhere ?
|
|

10-07-09, 03:20
|
|
Registered User
|
|
Join Date: Oct 2004
Location: Paris, FRANCE
Posts: 132
|
|
Do you know what I mean ?
A simple
Code:
SELECT * FROM table WHERE field = "x"
Does not use the index if table is larger than a certain size.
How to set the threshold ?
|
|

10-07-09, 03:52
|
|
vaguely human
|
|
Join Date: Jun 2007
Location: London
Posts: 2,519
|
|
|
|
Often, when tables are quite small, the optimiser decides it's more costly to go to the index pages and from there to the data pages so it just jumps straight into the data pages without using the indexes. Other times it might decide that because the values you are indexing aren't very varied ie if most the fields = "x" then it's not worth using that index. If you start setting thresholds yourself then you will almost certainly do a worse job than the existing optimiser.
Performance tuning is quite an in-depth subject and can't be covered in a simple post (or thread) so the above is just a small taster. Are you actually having performance issues in which case it might be worth giving us your SQL and the table definition and we can work on that.
|
|

10-10-09, 03:43
|
|
Registered User
|
|
Join Date: Oct 2004
Location: Paris, FRANCE
Posts: 132
|
|
Actually I have performance issues.
Here is attached one of my tables (38 fields) that doesn't use certain indexes.
Example:
Code:
EXPLAIN SELECT * FROM convention WHERE invoicing = 1
gives
Code:
+----+-------------+------------+-----+---------------+-----+---------+-----+------+-------------+
| id | select_type | table |type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-----+---------------+-----+---------+-----+------+-------------+
| 1 | SIMPLE | convention | ALL | Invoicing | | | | 798 | Using where |
+----+-------------+------------+-----+---------------+-----+---------+-----+------+-------------+
Code:
EXPLAIN SELECT * FROM convention WHERE invoicing = 0
gives
Code:
+----+-------------+------------+-----+---------------+-----------+---------+-------+------+-------+
| id | select_type | table |type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-----+---------------+-----------+---------+-------+------+-------+
| 1 | SIMPLE | convention | ref | Invoicing | Invoicing | 1 | const | 76 | |
+----+-------------+------------+-----+---------------+-----------+---------+-------+------+-------+
|
|

10-10-09, 05:10
|
|
vaguely human
|
|
Join Date: Jun 2007
Location: London
Posts: 2,519
|
|
No it won't use the index because the field invoicing only takes two values (0 and 1). This means that if the optimiser loads these index pages then at best it will cut down it's search by 50% - it probably figures that it's not worth the expense (in time) of loading these index pages and iterating between the index pages and the data pages each time.
I can't see why you're having performance issues pulling data from a table that only contains tables 800 rows. This should be almost instant even if it was doing a table scan. Is there more SQL involved?
|
|

10-10-09, 05:55
|
|
Registered User
|
|
Join Date: Oct 2004
Location: Paris, FRANCE
Posts: 132
|
|
I use this table to filter a query with multiple
tables JOINed searching in an average of 50k lines each.
If you look at my EXPLAINs you will see that the index
is used with 'invoicing = 0', not for 'invoicing = 1'.
This is affecting the execution time by about 50x.
|
|

10-10-09, 06:03
|
|
vaguely human
|
|
Join Date: Jun 2007
Location: London
Posts: 2,519
|
|
How many records have invoicing=0 and how many =1?
If only a small percentage are =0 then the index will be used.
The cut off point is usually around 10%
Can we see all the SQL?
|
|

10-10-09, 06:26
|
|
Registered User
|
|
Join Date: Oct 2004
Location: Paris, FRANCE
Posts: 132
|
|
As in the EXPLAINs:
798 rows with invoice = 1 (index not used)
76 rows with invoice = 0 (index used)
|
|

10-10-09, 15:02
|
|
vaguely human
|
|
Join Date: Jun 2007
Location: London
Posts: 2,519
|
|
total number of rows = 798 + 76 = 874
Num of rows where invoice=0 =76 =~ 9% which is under 10% so index used
Num of rows where invoice=1 =798 =~ 91% which is over 10% so index not used
The 10% cut off is my estimate at what MySQL uses - systems vary.
Mike
|
|

10-21-09, 04:51
|
|
Registered User
|
|
Join Date: Oct 2004
Location: Paris, FRANCE
Posts: 132
|
|
Isn't it when we have to get a lot of rows that MySQL should use the indexes ?
You said before
Quote:
|
Originally Posted by mike_bike_kite
[...]when tables are quite small, the optimiser decides it's more costly to go to the index pages and from there to the data pages so it just jumps straight into the data pages without using the indexes[...]
|
I have the impression that it does the opposite, doesn't it ?
|
|

10-21-09, 05:18
|
|
vaguely human
|
|
Join Date: Jun 2007
Location: London
Posts: 2,519
|
|
I don't think I can explain things any more clearly - sorry. Perhaps someone else can try? Alternatively why don't you just give us the SQL that's running slowly, tell us how long it's currently taking and we'll have a go at rewriting it for you.
|
|

10-22-09, 04:09
|
|
Registered User
|
|
Join Date: Oct 2004
Location: Paris, FRANCE
Posts: 132
|
|
Rudy, what do you think ?
Here is one of my query
Code:
SELECT
tup.session_id AS "session_id"
, sest.session_student_id AS "sest_id"
, usm.user_id AS "user_id"
, tut.user_id AS "tutor_id"
, UCASE(usm.user_family_name) AS "nom"
, usm.user_emp_no AS "emp_no"
, usm.matricule AS "matricule"
, tup.date AS "date"
, cem.center_name AS "centre"
, IF(tup.date <= cem.freezed_date, "validé", "") AS "validation"
, cem.center_id AS "centre_id"
, prm.product_name AS "produit"
, lam.language AS "langue"
, prc.name AS "cat"
, /* pri.cost */ 0 AS "pu"
, (CASE sest.attendance WHEN "P" THEN "Present" WHEN "A" THEN "Absent" ELSE "" END) AS "statut"
, (TIME_TO_SEC(tup.end_time) - TIME_TO_SEC(tup.start_time)) / 3600 AS "duree"
, CONCAT(tup.start_time, " - ", tup.end_time) AS "heure"
, sest.signature_done AS "signed"
, usm.user_login AS "login"
, spm.spm_name AS "sp_name"
, spp.spp_explanation AS "gp_name"
, spp.ded AS "ded"
, spp.cost_center AS "cc"
, spp.level AS "level"
, cli.alias AS "client"
, spm.spm_start_date AS "sp_start"
, spm.spm_end_date AS "sp_end"
, IF(spm.spm_status = 0, "Open", "Closed") AS "sp_status"
, com.company_name AS "company_name"
FROM company_master AS com
JOIN client AS cli ON (cli.company_id = com.company_id)
JOIN convention_master AS cov ON (cov.client_id = cli.client_id)
JOIN study_path_product AS spp ON (spp.conv_id = cov.id)
JOIN product_master AS prm ON (spp.product_id = prm.product_id)
JOIN product_category AS prc ON (prm.category_id = prc.id)
JOIN language_master AS lam ON (prm.language_id = lam.language_id)
JOIN study_path_main AS spm ON (spm.study_path_main_id = spp.study_path_main_id)
JOIN user_master AS usm ON (usm.user_id = spm.user_id)
JOIN session_student AS sest ON (sest.user_id = spm.user_id)
JOIN tutor_planner AS tup ON (
tup.date BETWEEN spm.spm_start_date AND spm.spm_end_date
AND tup.product_id = spp.product_id
AND tup.session_id = sest.session_id
)
JOIN user_master AS tut ON (tup.user_id = tut.user_id)
JOIN center_master AS cem ON (tup.center_id = cem.center_id)
WHERE TRUE
AND com.company_id = 175
AND tup.date BETWEEN "2009-09-01" AND "2009-09-30"
AND prm.category_id <> 6
GROUP BY
usm.user_id
, sest.session_id
ORDER BY
`nom`
, `prenom`
, tup.date
;
The EXPLAIN
Code:
+----+-------------+-------+--------+---------------------------------------------------+---------+---------+---------------------------------------------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------------------------------------------+---------+---------+---------------------------------------------+------+---------------------------------+
| 1 | SIMPLE | com | const | PRIMARY,CompanyID | PRIMARY | 4 | const | 1 | Using temporary; Using filesort |
| 1 | SIMPLE | cem | ALL | PRIMARY | NULL | NULL | NULL | 295 | |
| 1 | SIMPLE | tup | ref | PRIMARY,Tutor,Center,Date,Product,TutorDateCenter | Center | 5 | v8.cem.center_id | 12 | Using where |
| 1 | SIMPLE | prm | eq_ref | PRIMARY,Cat,Lang | PRIMARY | 4 | v8.tup.product_id | 1 | Using where |
| 1 | SIMPLE | lam | eq_ref | PRIMARY,language_id | PRIMARY | 4 | v8.prm.language_id | 1 | |
| 1 | SIMPLE | prc | eq_ref | PRIMARY | PRIMARY | 4 | v8.prm.category_id | 1 | |
| 1 | SIMPLE | tut | eq_ref | PRIMARY | PRIMARY | 98 | v8.tup.user_id | 1 | |
| 1 | SIMPLE | sest | ref | Protec1,Session,Learner | Protec1 | 98 | v8.tup.session_id | 1 | |
| 1 | SIMPLE | usm | eq_ref | PRIMARY | PRIMARY | 98 | v8.sest.user_id | 1 | |
| 1 | SIMPLE | spm | ref | PRIMARY,user_id,DateRange | user_id | 98 | v8.usm.user_id | 2 | Using where |
| 1 | SIMPLE | spp | eq_ref | SpProd | SpProd | 198 | v8.spm.study_path_main_id,v8.prm.product_id | 1 | Using where |
| 1 | SIMPLE | cov | eq_ref | PRIMARY,Client | PRIMARY | 4 | v8.spp.conv_id | 1 | |
| 1 | SIMPLE | cli | eq_ref | PRIMARY | PRIMARY | 4 | v8.cov.client_id | 1 | Using where |
+----+-------------+-------+--------+---------------------------------------------------+---------+---------+---------------------------------------------+------+---------------------------------+
- 20sec execution time
Here MySQL doesn't seem to take the good way to find the data,
I think the cem table should be at the end
Now change
Code:
< AND com.company_id = 175
> AND tup.center_id = 245
You will get
Code:
+----+-------------+-------+--------+---------------------------------------------------+---------+---------+---------------------------------------------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------------------------------------------+---------+---------+---------------------------------------------+------+---------------------------------+
| 1 | SIMPLE | cem | const | PRIMARY | PRIMARY | 4 | const | 1 | Using temporary; Using filesort |
| 1 | SIMPLE | tup | ref | PRIMARY,Tutor,Center,Date,Product,TutorDateCenter | Center | 5 | const | 4682 | Using where |
| 1 | SIMPLE | prm | eq_ref | PRIMARY,Cat,Lang | PRIMARY | 4 | v8.tup.product_id | 1 | Using where |
| 1 | SIMPLE | prc | eq_ref | PRIMARY | PRIMARY | 4 | v8.prm.category_id | 1 | |
| 1 | SIMPLE | lam | eq_ref | PRIMARY,language_id | PRIMARY | 4 | v8.prm.language_id | 1 | |
| 1 | SIMPLE | tut | eq_ref | PRIMARY | PRIMARY | 98 | v8.tup.user_id | 1 | |
| 1 | SIMPLE | sest | ref | Protec1,Session,Learner | Protec1 | 98 | v8.tup.session_id | 1 | |
| 1 | SIMPLE | usm | eq_ref | PRIMARY | PRIMARY | 98 | v8.sest.user_id | 1 | |
| 1 | SIMPLE | spm | ref | PRIMARY,user_id,DateRange | user_id | 98 | v8.sest.user_id | 2 | Using where |
| 1 | SIMPLE | spp | eq_ref | SpProd | SpProd | 198 | v8.spm.study_path_main_id,v8.tup.product_id | 1 | |
| 1 | SIMPLE | cov | eq_ref | PRIMARY,Client | PRIMARY | 4 | v8.spp.conv_id | 1 | |
| 1 | SIMPLE | cli | eq_ref | PRIMARY | PRIMARY | 4 | v8.cov.client_id | 1 | |
| 1 | SIMPLE | com | eq_ref | PRIMARY,CompanyID | PRIMARY | 4 | v8.cli.company_id | 1 | |
+----+-------------+-------+--------+---------------------------------------------------+---------+---------+---------------------------------------------+------+---------------------------------+
- With 0.3sec execution time
In both cases, the same volume is fetched (~100 lines returned)
|
|

10-22-09, 04:59
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,534
|
|
Quote:
|
Originally Posted by gtk
Rudy, what do you think ?
|
i think mike is doing a splendid job for you
your query is way too complicated for me

|
|

10-22-09, 05:14
|
|
Registered User
|
|
Join Date: Oct 2004
Location: Paris, FRANCE
Posts: 132
|
|
Rudy, this is the morning that is complicated, isn't it ? 
|
|

10-22-09, 05:17
|
|
vaguely human
|
|
Join Date: Jun 2007
Location: London
Posts: 2,519
|
|
Quote:
|
Originally Posted by r937
i think mike is doing a splendid job for you
|
I've already given up I'm afraid. I will quickly say that gtk's previous query just involved the field invoicing - now it appears that that this field doesn't appear at all within your SQL and there's now a join with 12 other tables but at least that explains why the index isn't used on that field. I suspect that the optimiser just takes one look at that query and simply throws up it's hands - that's certainly what I'm doing right now
|
Last edited by mike_bike_kite; 10-22-09 at 05:33.
|
| Thread Tools |
Search this Thread |
|
|
|
| Display Modes |
Linear Mode
|
Posting Rules
|
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts
HTML code is Off
|
|
|
|
|