Results 1 to 6 of 6
  1. #1
    Join Date
    Apr 2003
    Posts
    6

    Unanswered: GROUP BY differences vs Sybase

    Hi

    I have a simple table containing a checksum and a value.

    create table test (checksum varchar(10) not null, value varchar(10) not null);
    insert test values ("sum1", "value1");
    insert test values ("sum1", "value2");
    insert test values ("sum2", "value3");

    Now I want to find all duplicate checksums and their respective values.

    In sybase I run this query:

    select checksum, value from test
    group by checksum
    having count(checksum) > 1

    This produces two rows:

    sum1, value1
    sum1, value2

    However, when I run this query in MySQL, I only get one row:

    sum1, value1

    Any clues on how to produce the same results as in Sybase, i.e. finding all duplicate checksums and their respective values? And why is the result different? Probably something simple but I can't figure it out...

  2. #2
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    In sybase I run this query:

    select checksum, value from test
    group by checksum
    having count(checksum) > 1
    i'm surprised sybase lets you do that

    it's not valid sql

    ordinarily, i would expect to achieve the results you indicated like this:
    Code:
    select checksum, value 
      from test
     where checksum in
           ( select checksum
               from test
           group by checksum
           having count(checksum) > 1
           )
    and, of course, you can't do that in mysql until version 4.1


    rudy
    http://r937.com/

  3. #3
    Join Date
    Apr 2003
    Posts
    6
    Thanks for your reply, but I don't get what you mean by valid sql. Valid compared to what? ANSI SQL standards? There are a lot of SQL that's allowed that is not standard. It's allowed in MySQL as well even though the results are different, guess that's what you mean.

    Until subqueries I'll stick with creating a temp-table.

    thanks

  4. #4
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    it's not valid by the standard, and it's not valid in any other database that i know of (except mysql, which is a separate rant), and it doesn't even make sense

    GROUP BY checksum means you get one row per checksum -- that's the definition of GROUP BY

    how can you select checksum and foo if you group by checksum? which value of foo do you expect to see? there's no way you should get more than one row with the same value of checksum, so which value of foo should the database pick for you? ok, if you had selected checksum and max(foo), that would be completely different, but you didn't

    that's why i say it doesn't even make sense

    i don't have sybase so i can't test it, but i am totally shocked that it would allow that particular query to get past its syntax checker


    rudy

  5. #5
    Join Date
    Apr 2003
    Posts
    6
    Ok, Sybase might be the reason for me not knowing the group by standard (me not fully knowing SQL at all might be the other). In my T-SQL manual (based on 11.5.1) there is an example when they run a group by query without aggregate

    select type, advance
    from titles
    group by type

    And the manual says that such constructions has limited functionality and sometimes produce confusing results blah blah blah.

    In the example query, multiple rows by type is returned and the manual says that without aggregate for the advance column, the query returns values for every row in the table.

    Thanks for clearing it out!

  6. #6
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    since this is the mysql forum, you will want to be aware that mysql does let you write non-standard GROUP BY queries, but they warn you about it:
    MySQL has extended the use of GROUP BY. You can use columns or calculations in the SELECT expressions that don't appear in the GROUP BY part. This stands for any possible value for this group. You can use this to get better performance by avoiding sorting and grouping on unnecessary items. For example, you don't need to group on customer.name in the following query:
    Code:
    mysql> SELECT order.custid,customer.name,MAX(payments)
        ->        FROM order,customer
        ->        WHERE order.custid = customer.custid
        ->        GROUP BY order.custid;
    In standard SQL, you would have to add customer.name to the GROUP BY clause. In MySQL, the name is redundant if you don't run in ANSI mode.

    Don't use this feature if the columns you omit from the GROUP BY part aren't unique in the group! You will get unpredictable results.

    http://www.mysql.com/doc/en/GROUP-BY-Functions.html
    note the empahisis on "Don't use this feature" is theirs, not mine

    rudy
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •