Results 1 to 10 of 10
  1. #1
    Join Date
    Jan 2013
    Posts
    6

    Poor performance when using pagination and order_by

    I'm creating a web site using Django framework + mysql
    I have ~700 records spreading on 2 tables.

    I only give option to 5 pages not to whole results.

    The problem is that some queries take ~10 seconds which is way too much.
    and some take couple of ms.
    I indexed the relevant attributes that are part of the order_by query.

    I'm out of ideas what could have damaged mysql performance .

    This is the query that takes ~10 seconds:
    SELECT companies_company.id, companies_company.name, companies_company.sector, companies_company.sub_sector, companies_company.num_of_employees, companies_company.address, companies_company.home_page, companies_company.career_url, companies_company.city, companies_company.updating_date, COUNT(companies_selectedusercompanies.user_id) AS num_users FROM companies_company LEFT OUTER JOIN companies_selectedusercompanies ON (companies_company.id = companies_selectedusercompanies.company_id) WHERE companies_company.id IN (SELECT U0.id FROM companies_company U0 LEFT OUTER JOIN companies_selectedusercompanies U1 ON (U0.id = U1.company_id) GROUP BY U0.id, U0.name, U0.sector, U0.sub_sector, U0.num_of_employees, U0.address, U0.home_page, U0.career_url, U0.city, U0.updating_date ORDER BY NULL) GROUP BY companies_company.id, companies_company.name, companies_company.sector, companies_company.sub_sector, companies_company.num_of_employees, companies_company.address, companies_company.home_page, companies_company.career_url, companies_company.city, companies_company.updating_date ORDER BY companies_company.sector ASC LIMIT 20 OFFSET 340

    couple of ms:
    SELECT companies_company.id, companies_company.name, companies_company.sector, companies_company.sub_sector, companies_company.num_of_employees, companies_company.address, companies_company.home_page, companies_company.career_url, companies_company.city, companies_company.updating_date, COUNT(companies_selectedusercompanies.user_id) AS num_users FROM companies_company LEFT OUTER JOIN companies_selectedusercompanies ON (companies_company.id = companies_selectedusercompanies.company_id) WHERE companies_company.id IN (SELECT U0.id FROM companies_company U0 LEFT OUTER JOIN companies_selectedusercompanies U1 ON (U0.id = U1.company_id) GROUP BY U0.id, U0.name, U0.sector, U0.sub_sector, U0.num_of_employees, U0.address, U0.home_page, U0.career_url, U0.city, U0.updating_date ORDER BY NULL) GROUP BY companies_company.id, companies_company.name, companies_company.sector, companies_company.sub_sector, companies_company.num_of_employees, companies_company.address, companies_company.home_page, companies_company.career_url, companies_company.city, companies_company.updating_date ORDER BY companies_company.sector ASC LIMIT 20 OFFSET 160

  2. #2
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,479
    First of all,
    you should format your query, like...
    (I added aliases("AS cc" and "AS suc") and removed unnecessary parentheses.)
    Code:
    SELECT cc.id
         , cc.name
         , cc.sector
         , cc.sub_sector
         , cc.num_of_employees
         , cc.address
         , cc.home_page
         , cc.career_url
         , cc.city
         , cc.updating_date
         , COUNT(suc.user_id) AS num_users
     FROM  companies_company               AS cc
     LEFT  OUTER JOIN
           companies_selectedusercompanies AS suc
      ON   cc.id = suc.company_id
     WHERE cc.id
           IN (SELECT U0.id
                FROM  companies_company U0
                LEFT  OUTER JOIN
                      companies_selectedusercompanies U1
                 ON   U0.id = U1.company_id
                GROUP BY
                      U0.id
                    , U0.name
                    , U0.sector
                    , U0.sub_sector
                    , U0.num_of_employees
                    , U0.address
                    , U0.home_page
                    , U0.career_url
                    , U0.city
                    , U0.updating_date
                ORDER BY
                      NULL
              )
     GROUP BY
           cc.id
         , cc.name
         , cc.sector
         , cc.sub_sector
         , cc.num_of_employees
         , cc.address
         , cc.home_page
         , cc.career_url
         , cc.city
         , cc.updating_date
     ORDER BY
           cc.sector ASC
     LIMIT 20 OFFSET 340
    Briefly looking, both queries may be same except "LIMIT 20 OFFSET 160" or " LIMIT 20 OFFSET 340".


    Second,
    you should know how OUTER JOIN works, (in other words, role of ON condiions in OUTER JOIN).
    that is,
    All rows of outer table(i.e. left table of left outer join) are selected
    regardless of ON condiions(even if the ON condition includes some columns of outer table).
    ON condiions in OUTER JOIN specifies conditions to select(and join) rows of inner table(i.e. right table of left outer join).


    So, the "LEFT OUTER JOIN" in subquery in WHERE clause is not necessary.
    GROUP BY and ORDER BY are also not necessary.
    Code:
    ...
     WHERE cc.id
           IN (SELECT U0.id
                FROM  companies_company U0
              /*
                LEFT  OUTER JOIN
                      companies_selectedusercompanies U1
                 ON   U0.id = U1.company_id
              */
              /*
                GROUP BY
                      U0.id
                    , U0.name
                    , U0.sector
                    , U0.sub_sector
                    , U0.num_of_employees
                    , U0.address
                    , U0.home_page
                    , U0.career_url
                    , U0.city
                    , U0.updating_date
              */
              /*
                ORDER BY
                      NULL
              */
              )
    ...
    Remove unneccesary codes ...
    Code:
    ...
     FROM  companies_company AS cc
    ...
     WHERE cc.id
           IN (SELECT U0.id
                FROM  companies_company U0
              )
    ...
    "FROM companies_company U0" in the subquery is the same table as outer "FROM companies_company",
    then whole WHERE condition is not neccesary.


    As a cosequence,
    your query might be equivalent to...
    Code:
    SELECT id
         , name
         , sector
         , sub_sector
         , num_of_employees
         , address
         , home_page
         , career_url
         , city
         , updating_date
         , (SELECT COUNT(suc.user_id)
             FROM  companies_selectedusercompanies AS suc
             WHERE suc.company_id = cc.id
           ) AS num_users
     FROM  companies_company AS cc
     ORDER BY
           sector ASC
     LIMIT 20 OFFSET 340
               /* or 160 */
    ;

  3. #3
    Join Date
    Jan 2013
    Posts
    6
    thanks for the detailed replay.
    I think you omitted the IN query, I'm filtering company.id to get only required ids.

    The query was auto generated by Django.

    I could use raw sql.

    I noticed that the problem was with the count query that depends on second table which can be omitted.

    so I added another field that being incrmented so I only need to select this field and to spare the count query.

    What do you think about this solution?
    Should I change it and use the count query as you mentioned?
    Last edited by embedded; 01-08-13 at 13:49.

  4. #4
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,479
    I think you omitted the IN query, I'm filtering company.id to get only required ids.
    I had described the reason why the IN clause is not neccesary.
    I'll rephrase the reason again.

    (1) LEFT OUTER JOIN, GROUP BY and ORDER BY in IN-subquery are logically not neccesary.
    Because, all ids in companies_company U0 are included in the result of the subquery,
    wheather LEFT OUTER JOIN, GROUP BY and ORDER BY were specified or not.
    Code:
    ...
     FROM  companies_company               AS cc
    ...
     WHERE cc.id
           IN (SELECT U0.id
                FROM  companies_company U0
                LEFT  OUTER JOIN
                      companies_selectedusercompanies U1
                 ON   U0.id = U1.company_id
                GROUP BY
                      U0.id
                    , U0.name
                    , U0.sector
                    , U0.sub_sector
                    , U0.num_of_employees
                    , U0.address
                    , U0.home_page
                    , U0.career_url
                    , U0.city
                    , U0.updating_date
                ORDER BY
                      NULL
              )
    ...
    (2) Then, above part(FROM ... WHERE) will be equivalent to the following.
    Code:
    ...
     FROM  companies_company AS cc
    ...
     WHERE cc.id
           IN (SELECT U0.id
                FROM  companies_company U0
              )
    ...
    The IN-subquery will select all ids in companies_company table as I wrote in (1).
    So, the IN predicate will filter out no row. All rows in companies_company table satisfy the IN predicate.
    If strictly speaking, id having NULL value in companies_company table will be excluded(filtered out) by the IN predicate.


    I noticed that the problem was with the count query that depends on second table which can be omitted.

    so I added another field that being incrmented so I only need to select this field and to spare the count query.
    I want to recomend you to try my suggested query, and see the resulting output and performance.
    Code:
    SELECT id
         , name
         , sector
         , sub_sector
         , num_of_employees
         , address
         , home_page
         , career_url
         , city
         , updating_date
         , (SELECT COUNT(suc.user_id)
             FROM  companies_selectedusercompanies AS suc
             WHERE suc.company_id = cc.id
           ) AS num_users
     FROM  companies_company AS cc
     ORDER BY
           sector ASC
     LIMIT 20 OFFSET 340
               /* or 160 */
    ;

  5. #5
    Join Date
    Jan 2013
    Posts
    6
    If I understand you correctly you select all rows from companies.company.
    This was not my intension.
    I need filter by companies.I'd and select all records.

  6. #6
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,479
    If I understand you correctly you select all rows from companies.company.
    This was not my intension.
    But, I explained(twice) that your query will select all rows from companies_company.

    I need filter by companies.I'd and select all records.
    If you want to filter out by some conditions,
    please clarify the conditions.

    I want to repeat again that your IN predicate will filter out nothing.
    All rows(or ids except NULL value) of companies_company satisfy the IN predicate,
    even if there was no corresponding row of companies_selectedusercompanies satisfying "ON U0.id = U1.company_id".
    Code:
     WHERE cc.id
           IN (SELECT U0.id
                FROM  companies_company U0
                LEFT  OUTER JOIN
                      companies_selectedusercompanies U1
                 ON   U0.id = U1.company_id
                GROUP BY
                      U0.id
                    , U0.name
                    , U0.sector
                    , U0.sub_sector
                    , U0.num_of_employees
                    , U0.address
                    , U0.home_page
                    , U0.career_url
                    , U0.city
                    , U0.updating_date
                ORDER BY
                      NULL
              )

  7. #7
    Join Date
    Jan 2013
    Posts
    6
    Thanks I got it from your previous answer.
    How would you first filter on company.Id?

    First filter and then join?

  8. #8
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,479
    You may want to specify the conditions to filter out in WHERE clause.

    Though, I don't know internal behavior of MySQL,
    MySQL may apply the conditions(in WHERE clause) before joining, if it was logically possible and MySQL was clever enough.

  9. #9
    Join Date
    Jan 2013
    Posts
    6
    Thanks.
    Will keep you posted afrer I integrate it to my project.

  10. #10
    Join Date
    Jan 2013
    Posts
    6
    This did the trick.
    No joins.
    Queries now take couple of ms.

    Thanks

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •