If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Database Server Software > MySQL > Simple query optimization

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 10-30-09, 16:27
sql_er sql_er is offline
Registered User
 
Join Date: Jan 2007
Posts: 48
Simple query optimization

Hello,

We have the following 2 tables (and indexes) in our database:

Table 1
-------
CREATE TABLE linkentityarticle(id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, article_id INT UNSIGNED NOT NULL, entity_id INT UNSIGNED NOT NULL, score TINYINT UNSIGNED NOT NULL) Engine = InnoDB;

Indexes
-------
1. entity_id
2. entity_id + article_id
3. article_id + entity_id
4. score

Total # rows in table: 10 million
Total space used by table: 1.5 GB

Table 2
-------
CREATE TABLE entity(id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, node VARCHAR(255)) Engine = InnoDB;

Indexes
------
1. node

Total # rows in table: 600,000
Total space used by table: 88 MB

When I run the following query against these tables, it runs for over 30 seconds:

SELECT DISTINCT article_id
FROM linkentityarticle d
INNER JOIN
(
SELECT id FROM entity
WHERE node IN('sports')

) AS T1
ON T1.id = d.entity_id
ORDER BY d.article_id DESC
LIMIT 50;

If I remove ORDER BY d.article_id DESC, it takes 0.3 seconds (i.e. 100 fold improvement).

Below is the EXPLAIN PLAN WITH ORDER BY

id: 1
select_type: PRIMARY
table: <derived2>
type: system
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 1
Extra: Using temporary; Using filesort
*************************** 2. row ***************************
id: 1
select_type: PRIMARY
table: d
type: ref
possible_keys: linkentityarticle_entity_id,index_linkentityarticl e_entity_i d_article_id
key: linkentityarticle_entity_id
key_len: 4
ref: const
rows: 6708
Extra: Using where
*************************** 3. row ***************************
id: 2
select_type: DERIVED
table: drift_entity
type: const
possible_keys: unique_entity_node
key: unique_entity_node
key_len: 257
ref:
rows: 1
Extra: Using index


Below is the EXPLAIN WITHOUT ORDER BY:

id: 1
select_type: PRIMARY
table: <derived2>
type: system
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 1
Extra: Using temporary
*************************** 2. row ***************************
id: 1
select_type: PRIMARY
table: d
type: ref
possible_keys: linkentityarticle_entity_id,index_linkentityarticl e_entity_i d_article_id
key: linkentityarticle_entity_id
key_len: 4
ref: const
rows: 6708
Extra:
*************************** 3. row ***************************
id: 2
select_type: DERIVED
table: drift_entity
type: const
possible_keys: unique_entity_node
key: unique_entity_node
key_len: 257
ref:
rows: 1
Extra: Using index


We are using MySQL version 5.0.44 and our InnoDB Buffer Pool is set to 2GB.

Is there anything that can be done to speed up this query when ORDER BY is used (as we need to use it to get the latest data).

Thank you in advance!
Reply With Quote
  #2 (permalink)  
Old 10-30-09, 16:46
mike_bike_kite mike_bike_kite is offline
vaguely human
 
Join Date: Jun 2007
Location: London
Posts: 2,519
I only have 2 mins before nipping down the pub so this is my quick attempt
Code:
SELECT a.article_id
FROM   entity e, linkentityarticle a
where  e.node = 'sports'
       and a.entity_id = e.id
ORDER BY a.article_id DESC
LIMIT 50;
If it doesn't help then I'd show the details about the indexes ie unique etc.
Reply With Quote
  #3 (permalink)  
Old 11-03-09, 14:09
dav1mo dav1mo is offline
Registered User
 
Join Date: Dec 2007
Location: Richmond, VA
Posts: 782
how about defining your index as entity_id, article_id desc?
Dave
Reply With Quote
  #4 (permalink)  
Old 11-04-09, 13:03
sql_er sql_er is offline
Registered User
 
Join Date: Jan 2007
Posts: 48
mike_bike_kite,

I tried your suggestion. The resulting performance did not change.

The indexes are as such:

linkentityarticle
-----------------
id (clustered index)
article_id + entity_id (unique)
entity_id + article_id

entity
------
id (clustered index)
node (unique)

dav1mo,

I changed the index from entity_id + article_id to entity_id + article_id (DESC). The query performance did not change

I am assuming filesort is slowing down the query (as it is much faster without the ORDER BY). Are there known ways to make sorting faster?

Thanks a lot!
Reply With Quote
  #5 (permalink)  
Old 11-04-09, 14:23
mike_bike_kite mike_bike_kite is offline
vaguely human
 
Join Date: Jun 2007
Location: London
Posts: 2,519
Quote:
Originally Posted by sql_er View Post
I am assuming filesort is slowing down the query (as it is much faster without the ORDER BY). Are there known ways to make sorting faster?
I presume you've run the queries without the order by and it's fast enough. I guess my next question is why are you ordering by article_id anyway - wouldn't it be an internal field that's not shown to the user so couldn't you stick with the default order returned? If it's because the ids are assigned in order and you want only the 50 latest articles then why not include a date the article was created into the table and then first run a fast query restricted to sports articles in the last 2 months ordered by either the article id or the date field. If you get 50 articles returned then just use those 50 articles. If you got less than 50 then run the old (slow) query to get a more complete set of sports articles.

Are there lot's of different values for node as well as the sports value? If not then it might be skipping this index as there aren't enough different values. What percentage of the articles are sports articles?

I think the create index syntax does allow you to specify either ASC or DESC when creating the index but sadly this functionality isn't implemented yet. If it's easy for you to change the indexes then I'd try the following. I didn't understand the point of clustering the id field in both tables so I removed that part.

linkentityarticle
-----------------
id (index)
entity_id + article_id (unique)
article_id

entity
------
id (index)
node (unique)
node + id
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On