If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Database Server Software > MySQL > Query too slow on select with order on large table

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 03-07-09, 04:51
koandy1983 koandy1983 is offline
Registered User
 
Join Date: Jun 2008
Posts: 7
Query too slow on select with order on large table

Hello

I have a table, named logs, with 60.000.000 rows.

The table structure is:
id BIGINT, user_id INT, page_viewed VARCHAR(255), last_visit BIGINT

Id is primary key with autoincrement, user_id and last_visit are indexed.

The second table named users, containing ~100.000 rows, has the following structure:

id INT, user_name VARCHAR(255), banned BOOL, blocked INT

Id is primary key, banned and blocked are indexed.

I want to select from logs 200 rows ordered by last_visit ascending where user is not banned and blocked count smaller than 11.

I have made the following query:

Code:
select id, page_viewed from `logs`

where 
(select banned from users where id=logs.user_id limit 1)=0 
and 
(select blocked from users where id=urls.user_id limit 1)<11

order by `last_visit` asc
limit 200
This query takes ~9secs to complete on CORE2DUO E7300 with 4GB DDR2 RAM and 4 DRIVES SATA2 in RAID STRIPPING.

Biggest table, logs has ~10GB with 7GB data and 3GB indexes.

Is there any way to improve execution time?

I'm struggling with this query in different modes but this is the fastest way until now.

Please help!

Thank you!
Reply With Quote
  #2 (permalink)  
Old 03-07-09, 06:20
mike_bike_kite mike_bike_kite is offline
vaguely human
 
Join Date: Jun 2007
Location: London
Posts: 2,519
Quote:
Is there any way to improve execution time?
Comments rather than a solution:
  • 60m rows is a little excessive for logs don't you think?
  • Index on logs is built on user id rather than date.
  • Order by and limit won't speed things up because it first has to order ALL the rows it finds and THEN take the first 200.
  • Better to not use sub queries and just do straight join.
  • The fact you've put a limit 1 inside the sub query suggests your user table has duplicates on user id.

Suggestions:
  • Get rid of about 59m log records - do you really ever use them?
  • If you can use clustered indexes then add one on date and user id in that order.
  • Do a straight join on between the user table and the logs
  • Reduce the data being searched through by perhaps adding a query like date > a week ago.
Just my 2c.
Reply With Quote
  #3 (permalink)  
Old 03-07-09, 08:43
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,534
Code:
SELECT logs.id
     , logs.page_viewed 
     , logs.last_visit
  FROM logs
INNER
  JOIN users
    ON users.id = logs.user_id 
   AND users.banned = 0
   AND users.blocked < 11
ORDER 
    BY logs.last_visit ASC LIMIT 200
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #4 (permalink)  
Old 03-08-09, 03:27
koandy1983 koandy1983 is offline
Registered User
 
Join Date: Jun 2008
Posts: 7
Thank you for sugested query, it is fast than my query with about 2 seconds.
So i've got ~5sec per execution. I want something around max ~2sec, and i am afraid that it is not scalable too.

LE: Sorry, it ran from cache, real time was 13secs.

Last edited by koandy1983; 03-08-09 at 03:48.
Reply With Quote
  #5 (permalink)  
Old 03-08-09, 07:37
mike_bike_kite mike_bike_kite is offline
vaguely human
 
Join Date: Jun 2007
Location: London
Posts: 2,519
Did you consider any of the suggestions etc in my 1st post?
Reply With Quote
  #6 (permalink)  
Old 03-08-09, 09:21
shammat shammat is offline
Registered User
 
Join Date: Nov 2003
Posts: 2,407
Quote:
Originally Posted by koandy1983
Thank you for sugested query, it is fast than my query with about 2 seconds.
So i've got ~5sec per execution. I want something around max ~2sec, and i am afraid that it is not scalable too.

LE: Sorry, it ran from cache, real time was 13secs.
Standard steps to improve performance:

1) Check the execution plan
2) Create indexes on join or sort columns if not present
3) Re-check the execution plan if the new index is used
Reply With Quote
  #7 (permalink)  
Old 03-08-09, 11:15
koandy1983 koandy1983 is offline
Registered User
 
Join Date: Jun 2008
Posts: 7
Quote:
Originally Posted by mike_bike_kite
Did you consider any of the suggestions etc in my 1st post?
I will try to make clustered indexes. Thank you.
But how can you tell me to delete about 59m records?

If some one ask some opinion on technology construction of 100 floors building, someone will come and say, "no way, 20floors maximum", without even try to plan?
Reply With Quote
  #8 (permalink)  
Old 03-08-09, 11:35
mike_bike_kite mike_bike_kite is offline
vaguely human
 
Join Date: Jun 2007
Location: London
Posts: 2,519
Quote:
But how can you tell me to delete about 59m records?
OK then ,why do you need 60m log records?
Admittedly if you have 1m users then 60m log records wouldn't be extravagant.
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On