  1. #1
    Join Date
    May 2009
    Posts
    2

    Unanswered: Large Table and Complex Query

    Hi, I would like some advice on whether I am going about this the wrong way, or whether any improvements could be made.

    I have a rather large table (~14.5 million records). The goal of this query is to get download counts for products within the past 90 days.

    In addition to the counts next to the product codes, the product title and the top-tier category need to be displayed.

    On a small sampling of the tables this query seems to work fine:

    Code:
    SELECT download_log.code, patterns.title, tier_3.name, tier_2.name, tier_1.name,
           COUNT(download_log.code) AS qty
    FROM (
        (
            (
                patterns__categories
                INNER JOIN (
                    patterns
                    INNER JOIN download_log ON patterns.code = download_log.code
                ) ON patterns__categories.pattern_id = patterns.pattern_id
            )
            INNER JOIN categories AS tier_3 ON patterns__categories.cat_id = tier_3.cat_id
        )
        INNER JOIN categories AS tier_2 ON tier_3.parent_id = tier_2.cat_id
    )
    INNER JOIN categories AS tier_1 ON tier_2.parent_id = tier_1.cat_id
    WHERE '2009-02-07 00:00:00' < download_log.stamp
    GROUP BY download_log.code;
    But with the actual data, the query seems to hang.

    Here is the basic table structure:



    +-------------------------------+
    |---------download_log----------|
    +-----+----+---------+----------+
    |stamp|code|member_id|ip_address|
    +-----+----+---------+----------+


    +--------------------------+
    |---------patterns---------|
    +----------+----+-----+----+
    |pattern_id|code|title|file|
    +----------+----+-----+----+


    +---------------------+
    |------categories-----|
    +------+---------+----+
    |cat_id|parent_id|name|
    +------+---------+----+


    +--------------------+
    |patterns__categories|
    +----------+---------+
    |pattern_id|cat_id   |
    +----------+---------+


    Any suggestions would be greatly appreciated!
    Last edited by whiterguy2004; 05-07-09 at 22:40.

  2. #2
    Join Date
    Aug 2005
    Posts
    30
    Code:
    SELECT download_log.code, patterns.title, tier_3.name, tier_2.name, tier_1.name,
           COUNT(download_log.code) AS qty
    FROM download_log
    LEFT JOIN patterns ON patterns.code = download_log.code
    LEFT JOIN patterns__categories ON patterns__categories.pattern_id = patterns.pattern_id
    LEFT JOIN categories AS tier_3 ON patterns__categories.cat_id = tier_3.cat_id
    LEFT JOIN categories AS tier_2 ON tier_3.parent_id = tier_2.cat_id
    LEFT JOIN categories AS tier_1 ON tier_2.parent_id = tier_1.cat_id
    WHERE '2009-02-07 00:00:00' < download_log.stamp
    GROUP BY download_log.code;
    Try the query above; it should be much faster. Note that it returns only products that were downloaded in the last 90 days. A product with no downloads in that period will not appear in the results. You can run a separate query to find the products that are missing from these results.
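
    For the separate query mentioned above, one possible sketch is an anti-join from patterns to download_log. It assumes the column names from the table structure in the first post and reuses the same cutoff date; it has not been run against the real data.

    ```sql
    -- Products with no download_log rows after the cutoff date.
    -- The date filter sits in the ON clause so that patterns without a
    -- matching recent download are kept (with NULLs) instead of being dropped.
    SELECT patterns.code, patterns.title
    FROM patterns
    LEFT JOIN download_log
           ON download_log.code = patterns.code
          AND download_log.stamp > '2009-02-07 00:00:00'
    WHERE download_log.code IS NULL;
    ```

    Moving the date condition into the WHERE clause instead would turn this back into an inner join and return nothing useful.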

  3. #3
    Join Date
    May 2009
    Posts
    2
    Thanks! That looks to be a better solution and should be exactly what I need. I will give it a go!

    EDIT: This ran fine after repairing the table. It took about 85 seconds. Thanks again!
    Last edited by whiterguy2004; 05-11-09 at 14:18.

  4. #4
    Join Date
    Apr 2009
    Posts
    3

    Question

    From the above it appears that using inline views in MySQL comes at a high cost in performance. Is this always the case?

  5. #5
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    Quote Originally Posted by DBMark
    From the above it appears that using inline views in MySQL comes at a high cost in performance. Is this always the case?
    absolutely not

    inline views have nothing to do with performance

    performance is all about the indexes
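
    As a rough illustration of what "the indexes" could mean for this query: the thread never shows which indexes exist, so the statements below are only a sketch. The index names are hypothetical, and categories.cat_id is assumed to be the primary key already.

    ```sql
    -- Check what is already indexed before adding anything.
    SHOW INDEX FROM download_log;

    -- Hypothetical indexes covering the columns used in the WHERE and ON clauses.
    -- (stamp, code) lets the date range and the join column be read from the index.
    CREATE INDEX idx_download_log_stamp_code ON download_log (stamp, code);
    CREATE INDEX idx_patterns_code ON patterns (code);
    CREATE INDEX idx_pc_pattern_cat ON patterns__categories (pattern_id, cat_id);
    ```

    Whether MySQL actually uses any of these depends on the data distribution, so it is worth checking the plan with EXPLAIN afterwards.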

    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  6. #6
    Join Date
    Apr 2009
    Posts
    3

    Question

    Hi - Did you check the code before your reply? Yes, performance is normally all about having indexes (and persuading MySQL to utilise those indexes!). Yet the two different statements both seem to specify the joins using the same fields in the same way, for example:

    WHERE '2009-02-07 00:00:00' < download_log.stamp
    So my question remains: why is the new code more efficient than the first? I may be missing something in the syntax, but the only difference appears to be avoiding the use of an inline view. The fields are joined without the use of functions in both queries. Would the use of the syntax "LEFT JOIN" instead of "INNER JOIN" explain it?

  7. #7
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    Quote Originally Posted by DBMark
    Would the use of the syntax "LEFT JOIN" instead of "INNER JOIN" explain it?
    no, it would not

    the removal of the parentheses might, however
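
    One way to test that would be to keep INNER JOIN but drop the parentheses, then compare the EXPLAIN output against the original nested version. This is only a sketch built from the queries posted earlier in the thread, not something that was run on the real tables.

    ```sql
    -- Same joins and filter as the first post, but written as a flat join list.
    -- Compare this plan (join order, key, rows) with EXPLAIN on the nested form.
    EXPLAIN
    SELECT download_log.code, patterns.title, tier_3.name, tier_2.name, tier_1.name,
           COUNT(download_log.code) AS qty
    FROM download_log
    INNER JOIN patterns ON patterns.code = download_log.code
    INNER JOIN patterns__categories ON patterns__categories.pattern_id = patterns.pattern_id
    INNER JOIN categories AS tier_3 ON patterns__categories.cat_id = tier_3.cat_id
    INNER JOIN categories AS tier_2 ON tier_3.parent_id = tier_2.cat_id
    INNER JOIN categories AS tier_1 ON tier_2.parent_id = tier_1.cat_id
    WHERE download_log.stamp > '2009-02-07 00:00:00'
    GROUP BY download_log.code;
    ```

    If the two plans differ only in join order, the parentheses rather than the join type would be the likely explanation.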

