Page 1 of 2 12 LastLast
Results 1 to 15 of 21

Thread: query help

  1. #1
    Join Date
    Nov 2008
    Posts
    20

    Unanswered: query help

    Hello,
    I've a table ORD(ordid, custid, total)
    customer's orders where
    custid is customer's ID and total is payment
    I need to select those customers where every payment is greater
    than average.
    I tried:

    Code:
    select custid
    from ord , (select avg(total) as average from ord) dv
    where total > dv.average
    but it gives me the customers which also has payment less than average.
    how can i fix that it gives me only those who are greater?

  2. #2
    Join Date
    Jan 2007
    Location
    Jena, Germany
    Posts
    2,721
    Your query seems fine to me. I would suspect a problem in your DBMS.
    Knut Stolze
    IBM DB2 Analytics Accelerator
    IBM Germany Research & Development

  3. #3
    Join Date
    Dec 2008
    Location
    At work...
    Posts
    92
    Move/copy the sub-query to the where-clause.

    select custid
    from ord , (select avg(total) from ord) dv
    where total > (select avg(total) from ord)

  4. #4
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    and then remove the subquery from the FROM clause
    Code:
    SELECT custid
      FROM ord 
     WHERE total > (SELECT AVG(total) FROM ord)
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  5. #5
    Join Date
    Jan 2007
    Location
    Jena, Germany
    Posts
    2,721
    Semantically, this is not necessary. The subquery constructs a table with 1 row and 1 column, and that is joined to each row in the ORD table. Isn't that so? Therefore, I still claim that the query is valid and the DBMS executing the query screws up, i.e. has a bug.
    Knut Stolze
    IBM DB2 Analytics Accelerator
    IBM Germany Research & Development

  6. #6
    Join Date
    Dec 2008
    Location
    At work...
    Posts
    92
    Select-list expressions (e.g. the subquery discussed) can not be used in the where clause.

    Here's Celko's standard reply:
    "Here is how a SELECT works in Standard SQL ... at least in theory. Real
    products will optimize things when they can.

    a) Start in the FROM clause and build a working table from all of the
    joins, unions, intersections, and whatever other table constructors are
    there. The table expression> AS <correlation name> option allows you
    give a name to this working table which you then have to use for the
    rest of the containing query.

    b) Go to the WHERE clause and remove rows that do not pass criteria;
    that is, that do not test to TRUE (reject UNKNOWN and FALSE). The WHERE
    clause is applied to the working set in the FROM clause.

    c) Go to the optional GROUP BY clause, make groups and reduce each
    group to a single row, replacing the original working table with the new
    grouped table. The rows of a grouped table must be group
    characteristics: (1) a grouping column (2) a statistic about the group
    (i.e. aggregate functions) (3) a function or (4) an expression made up
    those three items.

    d) Go to the optional HAVING clause and apply it against the grouped
    working table; if there was no GROUP BY clause, treat the entire table
    as one group.

    e) Go to the SELECT clause and construct the expressions in the list.
    This means that the scalar subqueries, function calls and expressions in
    the SELECT are done after all the other clauses are done. The “AS”
    operator can also give names to expressions in the SELECT list. These
    new names come into existence all at once, but after the WHERE clause,
    GROUP BY clause and HAVING clause has been executed; you cannot use them
    in the SELECT list or the WHERE clause for that reason.

    If there is a SELECT DISTINCT, then redundant duplicate rows are
    removed. For purposes of defining a duplicate row, NULLs are treated as
    matching (just like in the GROUP BY).

    f) Nested query expressions follow the usual scoping rules you would
    expect from a block structured language like C, Pascal, Algol, etc.
    Namely, the innermost queries can reference columns and tables in the
    queries in which they are contained.

    --CELKO-- "

    As you can see, the result set rows are already defined when the select-list expressions are processed.

  7. #7
    Join Date
    Nov 2008
    Posts
    20
    Thanks for your replies, i've fixed the query to:
    Code:
    select custid from ord
    where custid not in 
    (select custid from ord, (select avg(total) as average from ord) dv
    where total < dv.average;
    now everyone who has values below average are removed and those who has
    values above average only are shown.

  8. #8
    Join Date
    Jan 2007
    Location
    Jena, Germany
    Posts
    2,721
    @JarHL: I don't see how you could possibly interpret the explanations you cited as an argument against subselects in a WHERE clause or SELECT list.

    A scalar subselect is an expression. Based on that, you can use subselects in both, a WHERE clause and the SELECT list. Even more so, there are predicates like EXISTS, IN, =ANY, =ALL, etc. that operate on whole sets, i.e. subselects.
    Knut Stolze
    IBM DB2 Analytics Accelerator
    IBM Germany Research & Development

  9. #9
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    Quote Originally Posted by JarlH
    As you can see, the result set rows are already defined when the select-list expressions are processed.
    that's true, eh
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  10. #10
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    Quote Originally Posted by khdani
    Thanks for your replies, i've fixed the query to:
    you made it worse

    the fact that it might actually be returning what you think it should be returning is just as mysterious as why your first query wasn't returning what you think it should be returning

    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  11. #11
    Join Date
    Jan 2007
    Location
    Jena, Germany
    Posts
    2,721
    So it comes down to either throw away your DBMS or to get it fixed.
    Knut Stolze
    IBM DB2 Analytics Accelerator
    IBM Germany Research & Development

  12. #12
    Join Date
    Nov 2008
    Posts
    20
    you made it worse

    the fact that it might actually be returning what you think it should be returning is just as mysterious as why your first query wasn't returning what you think it should be returning
    why ? the result of first query was incorrect.
    let's say the table has those tuples:
    ORD(ordid,custid,total)

    (100,1,50)
    (101,2,56)
    (102,3,48)
    (103,3,52)

    the average is 51.5
    the first query will return custid 1,2 and 3 (first, second and last row)
    but this is not correct because custid 3 has also an order which is less than
    average. but the problem was to select only those customers whose EVERY order is above average.
    so the second query will return the correct result, custid 1 and 2.

  13. #13
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    oh, that's pretty cool, i understand

    now that you've explained what you were doing

    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  14. #14
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    Code:
    SELECT custid
      FROM ord 
    GROUP
        BY custid
    HAVING COUNT(*) =
           SUM(CASE WHEN total > (SELECT AVG(total) FROM ord)
                    THEN 1 ELSE NULL END)
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  15. #15
    Join Date
    Nov 2008
    Posts
    20
    wow! that's cool haven't learnt that yet
    why count(*) ? i don't quite understand the logic of having count(*)...

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •