
Thread: MAX vs LIMIT 1

  1. #1
    Join Date
    Dec 2002
    Posts
    123

    Unanswered: MAX vs LIMIT 1

    Hi,

    What's a faster method for retrieving the last timestamp in a table that has thousands of records? Using the MAX function or LIMIT 1? Or does it not matter?

    SELECT * from tblname where id = 1234 ORDER BY datetime desc LIMIT 1
    OR
    SELECT MAX(datetime) from tblname where id = 1234

  2. #2
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    the two queries are not comparable, because of the columns they return

    and LIMIT is not standard SQL: it is a MySQL (and PostgreSQL) extension, not DB2 syntax

    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  4. #4
    Join Date
    Jan 2007
    Location
    Jena, Germany
    Posts
    2,721
    The first is faster because it will already throw a syntax error during parsing/query compilation. The second may actually do some real work and, therefore, needs more time.

    Assuming that you would fix the 1st query and use FETCH FIRST 1 ROW ONLY, you should have a look at the access plans to see what happens. Maybe DB2 uses the very same access plan and, thus, there will be no difference.
    Knut Stolze
    IBM DB2 Analytics Accelerator
    IBM Germany Research & Development

  5. #5
    Join Date
    Dec 2004
    Location
    Italy
    Posts
    32
    Well, the question is not trivial.

    Generally speaking, I'd prefer to use the MAX function to avoid the extra sort step caused by the ORDER BY clause.

  6. #6
    Join Date
    Jan 2007
    Location
    Jena, Germany
    Posts
    2,721
    How do you come to the conclusion that an ORDER BY requires a sort? ORDER BY only means that the DBMS has to return the rows in the specified order. Whether a sort actually happens internally to produce that order, or whether the DBMS does something else (like using an index that already has the right order), is a completely different question.

    So what you have to do is look at the access plans to figure out what is really happening internally. I can very well imagine that DB2 transforms an ORDER BY ... FETCH FIRST 1 ROW ONLY into a simple descent along the right-most edge of a B-tree to find the largest value, exactly as it would for MAX().

    Of course, this depends on the DBMS. Since you have NOT shown us valid DB2 syntax, we cannot be sure which system you are using.
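
    To make the "look at the access plans" advice concrete, here is a minimal sketch. It uses SQLite through Python's sqlite3 module purely as a stand-in, because it runs anywhere; it is not DB2, so it needs LIMIT 1 instead of FETCH FIRST 1 ROW ONLY, and DB2's own explain facility will print different plans. The table layout, the index name idx_id_datetime and the helper functions are assumptions made up for this illustration.

```python
import sqlite3

# Assumed queries, modelled on the ones in this thread (SQLite spelling).
ORDER_BY_QUERY = (
    "SELECT datetime FROM tblname WHERE id = 1234 "
    "ORDER BY datetime DESC LIMIT 1"
)
MAX_QUERY = "SELECT MAX(datetime) FROM tblname WHERE id = 1234"


def plan(conn, sql):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


def needs_sort(details):
    """True if the plan contains an explicit sort step (a temp B-tree)."""
    return any("TEMP B-TREE" in detail for detail in details)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblname (id INTEGER, datetime TEXT)")

# No index yet: the ORDER BY has to be satisfied by sorting.
plan_without_index = plan(conn, ORDER_BY_QUERY)

# Hypothetical index that already delivers rows in (id, datetime) order.
conn.execute("CREATE INDEX idx_id_datetime ON tblname (id, datetime)")
plan_with_index = plan(conn, ORDER_BY_QUERY)
max_plan_with_index = plan(conn, MAX_QUERY)

for label, details in [
    ("ORDER BY, no index", plan_without_index),
    ("ORDER BY, index", plan_with_index),
    ("MAX, index", max_plan_with_index),
]:
    print(label, "-> sort step:", needs_sort(details), details)
```

    In SQLite the sort step is there without the index and gone once the index exists, which is exactly the point: ORDER BY asks for an order, not for a sort. Whether DB2 does the same is something only its own access plan can tell you.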
    Last edited by stolze; 11-15-08 at 10:05.
    Knut Stolze
    IBM DB2 Analytics Accelerator
    IBM Germany Research & Development

  7. #7
    Join Date
    Dec 2004
    Location
    Italy
    Posts
    32
    I came to that conclusion when I examined the plan_table.

  8. #8
    Join Date
    Sep 2004
    Location
    Belgium
    Posts
    1,126
    Quote Originally Posted by db2user
    What's a faster method for retrieving the last timestamp in a table that has thousands of records? Using the MAX function or LIMIT 1? Or does it not matter?

    SELECT * from tblname where id = 1234 ORDER BY datetime desc LIMIT 1
    OR
    SELECT MAX(datetime) from tblname where id = 1234
    First of all, for a fair comparison, you should either replace "*" by "datetime" in the first query, or alternatively move the second query into a subquery:
    Code:
    SELECT datetime FROM tbl WHERE ... ORDER BY datetime DESC FETCH FIRST ROW ONLY
    versus
    SELECT MAX(datetime) FROM tbl WHERE ...
    or else
    Code:
    SELECT * FROM tbl WHERE ... ORDER BY datetime DESC FETCH FIRST ROW ONLY
    versus
    SELECT * FROM tbl WHERE datetime = (SELECT MAX(datetime) FROM tbl WHERE ...)
    The elements of the two pairs may even return different results, e.g. when several rows have the same timestamp.
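
    A minimal sketch of the tie case, again using SQLite via Python's sqlite3 module as a stand-in rather than DB2 (so LIMIT 1 replaces FETCH FIRST ROW ONLY). The payload column and the sample rows are invented for illustration, and the elided WHERE conditions are assumed here to be the id filter from the original question, repeated in the outer query.

```python
import sqlite3

# Hypothetical table and data; two rows for id 1234 share the latest timestamp.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblname (id INTEGER, datetime TEXT, payload TEXT)")
conn.executemany(
    "INSERT INTO tblname VALUES (?, ?, ?)",
    [
        (1234, "2008-11-14 09:00:00", "older"),
        (1234, "2008-11-15 10:05:00", "tie A"),
        (1234, "2008-11-15 10:05:00", "tie B"),
        (9999, "2008-11-16 00:00:00", "other id"),
    ],
)

# Variant 1: sort descending and keep only the first row.
first_row = conn.execute(
    "SELECT * FROM tblname WHERE id = ? ORDER BY datetime DESC LIMIT 1",
    (1234,),
).fetchall()

# Variant 2: every row whose timestamp equals the maximum for that id.
max_rows = conn.execute(
    "SELECT * FROM tblname WHERE id = ? AND datetime = "
    "(SELECT MAX(datetime) FROM tblname WHERE id = ?)",
    (1234, 1234),
).fetchall()

print(len(first_row), len(max_rows))
```

    The first variant always returns at most one row (which of the tied rows you get is not defined), while the second returns every row carrying the latest timestamp.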

    When no index exists on the datetime column, the two queries of the first interpretation will have exactly the same access path (viz. a table scan followed by a sort).
    For the last two queries, the second one only has to sort the datetime column (as in the first two queries) but must then re-access the table to fetch all matching rows, i.e. two table scans.
    So it needs twice the tablespace I/O compared to the third query, but that one needs more I/O (and CPU) during the sort, since it has to sort the full table (all columns, not just datetime).

    Hence, the answer to the question "which is better" also depends on the sizes of the other columns in the table!

    When an index exists with datetime as its first column, access paths for the two variants for each interpretation will be identical (as Knut Stolze already pointed out).
    --_Peter Vanroose,
    __IBM Certified Database Administrator, DB2 9 for z/OS
    __IBM Certified Application Developer
    __ABIS Training and Consulting
    __http://www.abis.be/

  9. #9
    Join Date
    Dec 2002
    Posts
    123
    Thanks much for the replies! Yes, you're correct. I was thinking in terms of PostgreSQL syntax; what I really meant was 'FETCH FIRST 1 ROW ONLY'. And yes, SELECT * would probably return a different result than SELECT MAX(datetime).
