Results 1 to 5 of 5
  1. #1
    Join Date
    Sep 2014
    Posts
    48

    Unanswered: Differences Between Select Distinct And Group By

    I've been running queries in and effort to validate data in my database against a single table that is based on data extracted from the XML file used by iTunes. I first did groupby Artist and Title against both sources and a discrepancy occured. The groupby against the iTunes XML table shows I higher count of individual albums than the same query against my tables. However when I did a sekect distinct using the iTunes table the counts matched my tables. I can figure out why a group by would yeild a different record count. Anybody got any ideas?

  2. #2
    Join Date
    Mar 2009
    Posts
    5,442
    Provided Answers: 14
    Code:
    GROUP BY Artist, Title
    Means that the query will yields one row for every row in the table where [Artist] and [Title] have different values. Also, you won't be able to use aggregate functions in the SELECT part of the query when using the DISTINCT clause and the way Null values are handled is different when using DISTINCT and when using GROUP BY.

    see: http://www.techonthenet.com/sql/distinct.php and http://www.techonthenet.com/sql/group_by.php
    Have a nice day!

  3. #3
    Join Date
    Sep 2014
    Posts
    48
    Correct me if I'm wrong here, but the group by would produce a list of unique albums for each artist. In other words you might see AC/DC Back In Black, AC/DC Black Ice and so on? The queries produced what appeared to be accurate results in that respect. However the query against the iTunes produced a higher number of records. 666 records in the iTunes query and 652 using my normalized table. The second query's record count is correct according to values show by iTunes. I fail to understand what would cause this discrepancy but using a select distinct produced an accurate value.

  4. #4
    Join Date
    Mar 2009
    Posts
    5,442
    Provided Answers: 14
    Quote Originally Posted by VSCurtis View Post
    the group by would produce a list of unique albums for each artist.
    That's correct provided that you group by artist. If you group by artist and album, the query will yield a data set containing one row for each pair of (album - artist) in the rows of the source table.
    Have a nice day!

  5. #5
    Join Date
    Sep 2014
    Posts
    48
    both queries were grouped by artist and album the itunes table is completely unnormalized containing one row per MP3. My tables are normalized versions of the itunes table. The itunes table is the data source for the other tables. I'm still having difficulty understanding the reason for the discrepancy.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •