Results 1 to 5 of 5
  1. #1
    Join Date
    Jan 2007
    Posts
    56

    Unanswered: Simple query optimization

    Hello,

    We have the following 2 tables (and indexes) in our database:

    Table 1
    -------
    CREATE TABLE linkentityarticle(id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, article_id INT UNSIGNED NOT NULL, entity_id INT UNSIGNED NOT NULL, score TINYINT UNSIGNED NOT NULL) Engine = InnoDB;

    Indexes
    -------
    1. entity_id
    2. entity_id + article_id
    3. article_id + entity_id
    4. score

    Total # rows in table: 10 million
    Total space used by table: 1.5 GB

    Table 2
    -------
    CREATE TABLE entity(id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, node VARCHAR(255)) Engine = InnoDB;

    Indexes
    ------
    1. node

    Total # rows in table: 600,000
    Total space used by table: 88 MB

    When I run the following query against these tables, it runs for over 30 seconds:

    SELECT DISTINCT article_id
    FROM linkentityarticle d
    INNER JOIN
    (
    SELECT id FROM entity
    WHERE node IN('sports')

    ) AS T1
    ON T1.id = d.entity_id
    ORDER BY d.article_id DESC
    LIMIT 50;

    If I remove ORDER BY d.article_id DESC, it takes 0.3 seconds (i.e. 100 fold improvement).

    Below is the EXPLAIN PLAN WITH ORDER BY

    id: 1
    select_type: PRIMARY
    table: <derived2>
    type: system
    possible_keys: NULL
    key: NULL
    key_len: NULL
    ref: NULL
    rows: 1
    Extra: Using temporary; Using filesort
    *************************** 2. row ***************************
    id: 1
    select_type: PRIMARY
    table: d
    type: ref
    possible_keys: linkentityarticle_entity_id,index_linkentityarticl e_entity_i d_article_id
    key: linkentityarticle_entity_id
    key_len: 4
    ref: const
    rows: 6708
    Extra: Using where
    *************************** 3. row ***************************
    id: 2
    select_type: DERIVED
    table: drift_entity
    type: const
    possible_keys: unique_entity_node
    key: unique_entity_node
    key_len: 257
    ref:
    rows: 1
    Extra: Using index


    Below is the EXPLAIN WITHOUT ORDER BY:

    id: 1
    select_type: PRIMARY
    table: <derived2>
    type: system
    possible_keys: NULL
    key: NULL
    key_len: NULL
    ref: NULL
    rows: 1
    Extra: Using temporary
    *************************** 2. row ***************************
    id: 1
    select_type: PRIMARY
    table: d
    type: ref
    possible_keys: linkentityarticle_entity_id,index_linkentityarticl e_entity_i d_article_id
    key: linkentityarticle_entity_id
    key_len: 4
    ref: const
    rows: 6708
    Extra:
    *************************** 3. row ***************************
    id: 2
    select_type: DERIVED
    table: drift_entity
    type: const
    possible_keys: unique_entity_node
    key: unique_entity_node
    key_len: 257
    ref:
    rows: 1
    Extra: Using index


    We are using MySQL version 5.0.44 and our InnoDB Buffer Pool is set to 2GB.

    Is there anything that can be done to speed up this query when ORDER BY is used (as we need to use it to get the latest data).

    Thank you in advance!

  2. #2
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    I only have 2 mins before nipping down the pub so this is my quick attempt
    Code:
    SELECT a.article_id
    FROM   entity e, linkentityarticle a
    where  e.node = 'sports'
           and a.entity_id = e.id
    ORDER BY a.article_id DESC
    LIMIT 50;
    If it doesn't help then I'd show the details about the indexes ie unique etc.

  3. #3
    Join Date
    Dec 2007
    Location
    Richmond, VA
    Posts
    1,328
    Provided Answers: 5
    how about defining your index as entity_id, article_id desc?
    Dave

  4. #4
    Join Date
    Jan 2007
    Posts
    56
    mike_bike_kite,

    I tried your suggestion. The resulting performance did not change.

    The indexes are as such:

    linkentityarticle
    -----------------
    id (clustered index)
    article_id + entity_id (unique)
    entity_id + article_id

    entity
    ------
    id (clustered index)
    node (unique)

    dav1mo,

    I changed the index from entity_id + article_id to entity_id + article_id (DESC). The query performance did not change

    I am assuming filesort is slowing down the query (as it is much faster without the ORDER BY). Are there known ways to make sorting faster?

    Thanks a lot!

  5. #5
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    Quote Originally Posted by sql_er View Post
    I am assuming filesort is slowing down the query (as it is much faster without the ORDER BY). Are there known ways to make sorting faster?
    I presume you've run the queries without the order by and it's fast enough. I guess my next question is why are you ordering by article_id anyway - wouldn't it be an internal field that's not shown to the user so couldn't you stick with the default order returned? If it's because the ids are assigned in order and you want only the 50 latest articles then why not include a date the article was created into the table and then first run a fast query restricted to sports articles in the last 2 months ordered by either the article id or the date field. If you get 50 articles returned then just use those 50 articles. If you got less than 50 then run the old (slow) query to get a more complete set of sports articles.

    Are there lot's of different values for node as well as the sports value? If not then it might be skipping this index as there aren't enough different values. What percentage of the articles are sports articles?

    I think the create index syntax does allow you to specify either ASC or DESC when creating the index but sadly this functionality isn't implemented yet. If it's easy for you to change the indexes then I'd try the following. I didn't understand the point of clustering the id field in both tables so I removed that part.

    linkentityarticle
    -----------------
    id (index)
    entity_id + article_id (unique)
    article_id

    entity
    ------
    id (index)
    node (unique)
    node + id

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •