Results 1 to 9 of 9
  1. #1
    Join Date
    May 2007
    Posts
    80

    Unanswered: Question on MAX() and how it selects other fields that are returned

    Got a table like this:

    g m o1 o2 o3
    A 1 x y z
    A 2 a b c
    A 3 e r t
    B 7 d r t
    B 8 w d t
    B 9 a s d


    SELECT g,max(m)
    FROM table
    GROUP BY g

    will give me 2 records grouped on 'g' with the maximum for 'm' per group
    A 3
    B 9


    SELECT g,max(m),o1,o2,o3
    FROM table
    GROUP BY g

    will give me 2 records grouped on 'g' with the maximum for 'm' per group
    and the additional fields i ask for
    A 3 x y z
    B 9 d r t

    Question:
    what do i need to do in the query to be absolutely certain that the additional fields i ask actually belong to the record found by max() . I tried this as above and the values returned for o1,o2,o3 appear to belong to the record which happens to be first in the group and not the one with the max(m) value.

    So the result i want is:
    A 3 e r t
    B 9 a s d

  2. #2
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    Quote Originally Posted by BettingSherlock View Post
    what do i need to do in the query to be absolutely certain that the additional fields i ask actually belong to the record found by max()
    please read this -- GROUP BY and HAVING with Hidden Columns

    Code:
    SELECT t.g
         , t.m 
         , t.o1 
         , t.o2 
         , t.o3
      FROM ( SELECT g
                  , MAX(m) AS max_m
               FROM daTable
             GROUP 
                 BY g ) AS s
    INNER
      JOIN daTable AS t
        ON t.g = s.g
       AND t.m = s.max_m
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  3. #3
    Join Date
    May 2007
    Posts
    80
    Thanks, hard to get this from the standard mysql documentation,
    ended up with the same join construction.

    Code:
    select    common_selectionID,
              handicapvalue,
              odds_decimal,
              odds_fractional1,
              odds_fractional2,
              odds_american
    FROM
    
      ( select    common_selectionID  AS sid,
                  handicapvalue       AS hval,
                  max(odds_decimal)   AS maxodds
        FROM      sbsvb_common_market_odds AS wo
        WHERE     wo.common_marketinstanceID = 138
        GROUP BY  wo.common_selectionID,handicapvalue) AS maxed
    
    LEFT JOIN sbsvb_common_market_odds as wo
              ON  wo.common_selectionID = maxed.sid
              AND wo.handicapvalue      = maxed.hval
              AND wo.odds_decimal       = maxed.maxodds
    ORDER BY  handicapvalue,common_selectionID;

    So select the groups and max() value, then join the table with itself to get the additional fields using the max() value found as condition for the join.
    Makes sense but does appear a bit cumbersome.

  4. #4
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    two suggestions:

    first, don't use the same table alias ("wo") both inside and outside a subquery... you will surely one day run into trouble

    second, make it an INNER JOIN instead of a LEFT OUTER JOIN
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  5. #5
    Join Date
    May 2007
    Posts
    80
    i get the point about the alias AS 'wo'
    but why INNER JOIN ?

    I wrote LEFT JOIN because...
    the left side of the join is the select i already done and that returns a very limited number of records
    whereas the right side of the join is the original table containing thousands of record
    so...

    my logic says to use a left join as that joins a limited number of records,
    whereas an inner join would also attemps to find a join from right to left
    which in this particular case has no point as the returned records are the same

    ok, it may be logical but not neccesarily efficient, so did i miss something in the workings of the JOIN that would make INNER more efficient ?

  6. #6
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    Quote Originally Posted by BettingSherlock View Post
    my logic says ...
    i am sorry to have to tell you this, but your logic is wrong

    inner joins return only those rows which satisfy the join conditions

    the efficiency of this is determined by the database optimizer, but efficiency is not germane to the issue of which rows to return

    so inner joins return only those rows which satisfy the join conditions

    this would lead you to suspect that outer joins return some rows which don't satisfy the join conditions, and this is exactly what happens -- some of the rows returned are "outside" of the rows which satisfy the join conditions

    in a LEFT OUTER JOIN, these "outer" rows come from the left table, and in a RIGHT OUTER JOIN these "outer" rows come from the right table

    to put this a slightly different way, a LEFT OUTER JOIN returns all rows of the left table, whether or not there is a matching row from the right table, where "matching" means that they satisfy the join conditions

    and none of this has to do with efficiency

    okay, so in your example, the left table of the join is the subquery which finds the maximum per group, and the right table is the actual table itself

    there cannot possibly be a row in the left table (the subquery) which doesn't match a row from the right table

    therefore it should be an INNER JOIN

    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  7. #7
    Join Date
    May 2007
    Posts
    80
    having read that a few times i now (reluctantly) agree,
    or at least i can follow the alternative logic

    ok, so now, in this case both join methods will actually result in exactly the same thing,
    so the question becomes which one is most efficient ?

    ( realising the possibility it may be a pointless question as there is the possibility the efficiency cannot be determined either way and should simply be left to the database server and therefore it would be a waste of time to even consider the question )

  8. #8
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    Quote Originally Posted by BettingSherlock View Post
    ... so the question becomes which one is most efficient ?
    best case scenario is they are equally efficient, and inner is never worse

    there is of course the other consideration: maintenance

    have you ever had to understand a query, either by taking over someone else's work, or your own, where you go "what was i trying to do here?"

    it helps if the query author wrote outer joins only when they are needed logically
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  9. #9
    Join Date
    May 2007
    Posts
    80
    who , me , nah , never


    I tend to adress the maintenance issue mainly by adding documentation and comments throughout my applications. If that doesn't do it the logic of how the SQL was arrived at won't matter much anyhow.

    Interesting conversation, thanks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •