Results 1 to 8 of 8
  1. #1
    Join Date
    Dec 2008
    Posts
    4

    Unanswered: One JOIN: OK, Two JOIN: DISASTER

    Let me show you something:

    # Query_time: 1 Lock_time: 0 Rows_sent: 1 Rows_examined: 150000
    SELECT COUNT(DISTINCT(u.unit_id))
    FROM unit u
    RIGHT JOIN
    (SELECT up1.unit_id
    FROM unit_param up1
    WHERE up1.unit_type_param_id = 24
    AND up1.value = ‘ServiceA’
    ) nmp0
    ON u.unit_id = nmp0.unit_id;

    This query is fine. No problem. Now, I have to make another selection from the unit_param table, for which I need to do another JOIN in addition to the previous query. The first section of the query is exactly the same as the previous query. Beware:

    # Query_time: 1138 Lock_time: 0 Rows_sent: 0 Rows_examined: 2271979789
    SELECT COUNT(DISTINCT(u.unit_id))
    FROM unit u
    RIGHT JOIN
    (SELECT up1.unit_id
    FROM unit_param up1
    WHERE up1.unit_type_param_id = 24
    AND up1.value = ‘ServiceA’
    ) nmp0
    ON u.unit_id = nmp0.unit_id
    RIGHT JOIN
    (SELECT up1.unit_id
    FROM unit_param up1
    WHERE up1.unit_type_param_id = 23
    AND up1.value = ‘Bigland’
    ) nmp1
    ON u.unit_id = nmp1.unit_id;

    This query never responded, I had to cancel it (but not before it had run for 20min!! running on 100% CPU) However, the number of rows with “Bigland” was just 75K rows compared to rows with “ServiceA” which was 50K rows. The initial table (unit) was 100K rows. Each row record is approx. 500B. So adding ONE JOIN extra, with an additional 75K rows to JOIN, the query went from OK to a DISASTER!!!!

    The number of rows involved went from 150K rows to 2.2G rows!!!!! How on earth could this happen? Has the JOIN thing gone completely crazy??? What can I do about this? I have tried adjusting the join_buffer_size (128M and 16M) but to no avail.

    I must admit I am getting desperate now.

    Sincerly grateful for any input here!

    Morten Simonsen

  2. #2
    Join Date
    Jan 2007
    Location
    UK
    Posts
    11,434
    Provided Answers: 10
    Can you let us know a little more about what you're actually trying to achieve?

    Looks to me like your query could be simplified somewhat...
    George
    Home | Blog

  3. #3
    Join Date
    Dec 2008
    Posts
    4
    Ok...here goes:

    I have a list of "units" in one table. These units can have many "properties". We have made unit_param table to hold these properties. Each row in unit_param holds exactly one property for one unit.

    Now, if I want to list all units with property A = Foo and property B = Bar, then I think I have to do it this way.

    Thanks for your interest

    Morten Simonsen

  4. #4
    Join Date
    Jan 2007
    Location
    UK
    Posts
    11,434
    Provided Answers: 10
    If you want those units with BOTH properties, then you should be using two INNER joins.
    George
    Home | Blog

  5. #5
    Join Date
    Dec 2008
    Posts
    4

    Smile

    Hi there

    I rewrote the SQL to:

    SELECT COUNT(tmp.unit_id)
    FROM
    (SELECT u.unit_id
    FROM unit u,
    unit_param up
    WHERE u.unit_id = up.unit_id
    AND up.value = 'ServiceA'
    ) tmp,
    unit_param up1
    WHERE tmp.unit_id = up1.unit_id
    AND up1.value = 'Bigland';

    It took only 5 sec now!!! Very good. I am grateful. Was this the way you intended?

    Still, there must be a serious bug in MySQL. My previous SQL worked fine on Oracle.

    Thanks a lot!!!!!

    Morten Simonsen

  6. #6
    Join Date
    Dec 2008
    Posts
    4
    Actually, I forgot a couple of things in my SQL:

    SELECT COUNT(tmp.unit_id)
    FROM
    (SELECT u.unit_id
    FROM unit u,
    unit_param up
    WHERE u.unit_id = up.unit_id
    AND up.unit_type_param_id = 24
    AND up.value = 'ServiceA'
    ) tmp,
    unit_param up1
    WHERE tmp.unit_id = up1.unit_id
    AND up1.unit_type_param_id = 23
    AND up1.value = 'Bigland';


    now it responds in 1 sec.



    Morten Simonsen

  7. #7
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    Quote Originally Posted by georgev
    If you want those units with BOTH properties, then you should be using two INNER joins.
    and if three, then three, and so on

    the problem is, this gets ugly

    the alternative (and i'm sure you've seen it before georgie), is simply to count them

    Code:
    SELECT COUNT(*)
      FROM (
           SELECT unit_id
             FROM unit u
           INNER
             JOIN unit_param up
               ON up.unit_id = u.unit_id 
            WHERE ( 
                  up.unit_type_param_id = 24
              AND up.value = 'ServiceA'
                  )
               OR (
                  up.unit_type_param_id = 23
              AND up.value = 'Bigland'
                  )
           GROUP
               BY unit_id
           HAVING COUNT(*) = 2
           ) AS d
    why is counting better? because then you can do queries such as "must have at least three of"

    example:
    Code:
    SELECT candidate_name
      FROM applications
     WHERE job_skill IN ( 'CSS','HTML','ASP','PHP','SQL')
    GROUP
        BY candidate_name
    HAVING COUNT(*) >= 3
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  8. #8
    Join Date
    Jan 2007
    Location
    UK
    Posts
    11,434
    Provided Answers: 10
    Quote Originally Posted by r937
    i'm sure you've seen it before georgie
    Absolutely! It's another favourite that I learned from none other than my ledherhosen-sporting peer
    George
    Home | Blog

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •