  1. #1
    Join Date
    Jun 2008
    Posts
    7

    Unanswered: Query too slow on select with order on large table

    Hello

I have a table named logs with 60,000,000 rows.

    The table structure is:
    id BIGINT, user_id INT, page_viewed VARCHAR(255), last_visit BIGINT

    Id is primary key with autoincrement, user_id and last_visit are indexed.

The second table, named users, containing ~100,000 rows, has the following structure:

    id INT, user_name VARCHAR(255), banned BOOL, blocked INT

    Id is primary key, banned and blocked are indexed.
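
    In MySQL DDL the two tables would look roughly like this (the storage engine, NOT NULL choices, and index names are just placeholders I'm assuming, not something stated above):

    Code:
    CREATE TABLE users (
      id        INT NOT NULL,    -- primary key
      user_name VARCHAR(255),
      banned    BOOL,            -- indexed
      blocked   INT,             -- indexed
      PRIMARY KEY (id),
      INDEX ix_banned  (banned),
      INDEX ix_blocked (blocked)
    ) ENGINE=InnoDB;             -- engine assumed

    CREATE TABLE logs (
      id          BIGINT NOT NULL AUTO_INCREMENT,  -- primary key
      user_id     INT,                             -- indexed
      page_viewed VARCHAR(255),
      last_visit  BIGINT,                          -- indexed; presumably a Unix timestamp
      PRIMARY KEY (id),
      INDEX ix_user_id    (user_id),
      INDEX ix_last_visit (last_visit)
    ) ENGINE=InnoDB;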

I want to select 200 rows from logs, ordered by last_visit ascending, where the user is not banned and the blocked count is smaller than 11.

    I have made the following query:

    Code:
select id, page_viewed from `logs`
where
(select banned from users where id = logs.user_id limit 1) = 0
and
(select blocked from users where id = logs.user_id limit 1) < 11
order by `last_visit` asc
limit 200
This query takes ~9 seconds to complete on a Core 2 Duo E7300 with 4 GB DDR2 RAM and four SATA2 drives in RAID striping.

The biggest table, logs, is ~10 GB: 7 GB of data and 3 GB of indexes.

    Is there any way to improve execution time?

I've tried this query in different forms, but this is the fastest version so far.

    Please help!

    Thank you!

  2. #2
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    Is there any way to improve execution time?
    Comments rather than a solution:
• 60m rows is a little excessive for logs, don't you think?
    • Index on logs is built on user id rather than date.
    • Order by and limit won't speed things up because it first has to order ALL the rows it finds and THEN take the first 200.
• Better not to use subqueries and just do a straight join.
• The fact you've put a LIMIT 1 inside the subquery suggests your users table has duplicates on user id.


    Suggestions:
    • Get rid of about 59m log records - do you really ever use them?
    • If you can use clustered indexes then add one on date and user id in that order.
• Do a straight join between the users table and the logs table.
• Reduce the data being searched by adding a condition like last_visit > a week ago (see the sketch after this list).
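
    As a rough sketch, the last two suggestions together might look like this (the composite index stands in for a true clustered index, since InnoDB clusters on the primary key anyway, and the cutoff assumes last_visit holds Unix epoch seconds):

    Code:
    -- composite index so the date filter and the join column are read together
    ALTER TABLE logs ADD INDEX ix_visit_user (last_visit, user_id);

    -- straight join plus a date cutoff to shrink the rows being scanned
    SELECT logs.id, logs.page_viewed
      FROM logs
     INNER JOIN users
        ON users.id = logs.user_id
     WHERE users.banned = 0
       AND users.blocked < 11
       AND logs.last_visit > UNIX_TIMESTAMP(NOW() - INTERVAL 7 DAY)
     ORDER BY logs.last_visit ASC
     LIMIT 200;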

    Just my 2c.

  3. #3
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    Code:
    SELECT logs.id
         , logs.page_viewed 
         , logs.last_visit
      FROM logs
    INNER
      JOIN users
        ON users.id = logs.user_id 
       AND users.banned = 0
       AND users.blocked < 11
    ORDER 
        BY logs.last_visit ASC LIMIT 200
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  4. #4
    Join Date
    Jun 2008
    Posts
    7
Thank you for the suggested query; it is faster than mine by about 2 seconds.
So I've got ~5 sec per execution. I want something around ~2 sec max, and I'm afraid it won't scale either.

Edit: Sorry, that ran from the cache; the real time was 13 seconds.
    Last edited by koandy1983; 03-08-09 at 04:48.

  5. #5
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
Did you consider any of the suggestions in my first post?

  6. #6
    Join Date
    Nov 2003
    Posts
    2,935
    Provided Answers: 12
    Quote Originally Posted by koandy1983
Thank you for the suggested query; it is faster than mine by about 2 seconds.
So I've got ~5 sec per execution. I want something around ~2 sec max, and I'm afraid it won't scale either.

Edit: Sorry, that ran from the cache; the real time was 13 seconds.
    Standard steps to improve performance:

1) Check the execution plan (see the sketch below)
2) Create indexes on join or sort columns if not present
3) Re-check the execution plan to confirm the new index is used
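
    For step 1, you can run EXPLAIN against a WHERE-clause variant of the join suggested earlier and look at the key and Extra columns:

    Code:
    EXPLAIN
    SELECT logs.id, logs.page_viewed, logs.last_visit
      FROM logs
     INNER JOIN users
        ON users.id = logs.user_id
     WHERE users.banned = 0
       AND users.blocked < 11
     ORDER BY logs.last_visit ASC
     LIMIT 200;

    If the plan reads logs through the last_visit index, the LIMIT can stop the scan after 200 matches; if Extra shows "Using filesort", every matching row gets sorted before the first 200 are returned.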

  7. #7
    Join Date
    Jun 2008
    Posts
    7
    Quote Originally Posted by mike_bike_kite
    Did you consider any of the suggestions etc in my 1st post?
I will try the clustered indexes. Thank you.
    But how can you tell me to delete about 59m records?

If someone asked for an opinion on the construction technology for a 100-floor building, would someone really just say "no way, 20 floors maximum" without even trying to plan it?

  8. #8
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    But how can you tell me to delete about 59m records?
OK then, why do you need 60m log records?
Admittedly, if you have 1m users then 60m log records wouldn't be extravagant; that's only 60 per user.
