Results 1 to 14 of 14

Thread: Select Distinct

  1. #1
    Join Date
    May 2004
    Posts
    19

    Unanswered: Select Distinct

    Hi,

    I wonder if anyone here can shed some light on why the query below produces duplicate EmailAddress values even though we specify the DISTINCT clause.

    SELECT DISTINCT(EmailAddress) SubscriberID, FirstName, Surname, SubscriberID
    FROM TestMailingList
    ORDER BY EmailAddress

    Thanks.

  2. #2
    Join Date
    Jul 2002
    Posts
    229
    Because of the other columns in the column list?

  3. #3
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    what coolberg said!

    DISTINCT is not a function
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  4. #4
    Join Date
    May 2004
    Posts
    19
    Thanks for the input guys, but I am not any the wiser then, are you both saying that due to each record having a unique ID that the distinct clause will not suffice for what I am trying to achieve?

    If so, are there any alternative ways to get the duplicated email addresses out of the resultset?

  5. #5
    Join Date
    Nov 2003
    Posts
    2,935
    Provided Answers: 12
    Quote Originally Posted by g2bam
    If so, are there any alternative ways to get the duplicated email addresses out of the resultset?
    Code:
    SELECT DISTINCT EmailAddress
    FROM TestMailingList
    ORDER BY EmailAddress
    Will not give you any duplicate email address.

  6. #6
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    this should help --
    Code:
    select EmailAddress
         , SubscriberID
         , FirstName
         , Surname
         , SubscriberID
      from TestMailingList
     where EmailAddress IN
         ( select EmailAddress             
             from TestMailingList          
           group                           
               by EmailAddress             
           having count(*) > 1 )    
    order
        by EmailAddress
         , SubscriberID
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  7. #7
    Join Date
    Nov 2003
    Posts
    2,935
    Provided Answers: 12
    @r937
    I might misunderstand the poster's intention, but I thought he does not want duplicate email addresses. Your query will return those entries where there are duplicate email addresses (and only those)

  8. #8
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    yes, of course -- i posted my query to assist him in finding the dupes

    obviously, you find the dupes and then you remove them, eh

    after that, there is no more need for the DISTINCT or anything like it, and naught left to do but declare a UNIQUE constraint so that the dupes don't ever crop up again

    ah, the perils of the dreaded auto-incrementing surrogate key...
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  9. #9
    Join Date
    Nov 2003
    Posts
    2,935
    Provided Answers: 12
    Well we don't know what the requirements are.
    Maybe duplicate emails are allowed, he just wants to prevent sending emails twice.

    We will never know until he tells us

  10. #10
    Join Date
    May 2004
    Posts
    19
    Yes duplicates are allowed as unfortunately they get imported from a client's own spreadsheet so we have no control over how they initially manage their captures.

    But like the man said, we don't want to send duplicate mails to the list. I note whem we omit the SubscriberID from the query we get what we want as far as email addresses go, but we also do need the ID's to go back to the client.

    So I hope that helps a bit for all you SQL guru's.

  11. #11
    Join Date
    Nov 2003
    Posts
    2,935
    Provided Answers: 12
    Quote Originally Posted by g2bam
    I note whem we omit the SubscriberID from the query we get what we want as far as email addresses go, but we also do need the ID's to go back to the client
    The question is: which subscriberID do you want, if you have more than one email address?

  12. #12
    Join Date
    May 2004
    Posts
    19
    That's a very good question, but it doesn't matter which ID we pick up so long as we get one, so no preference here.

  13. #13
    Join Date
    Jan 2003
    Location
    Pittsburgh, PA
    Posts
    86

    Try this

    Quote Originally Posted by g2bam
    That's a very good question, but it doesn't matter which ID we pick up so long as we get one, so no preference here.
    You can probably get the results you need with out using 'Distinct'. Try a GROUP in a subgroup by EmailAddress, as follows:


    select EmailAddress, FirstName, Surname, T.SubscriberID
    from TestMailingList T
    join (select EmailAddress, Max(SubscriberID) as SubscriberID
    from TestMailingList
    group by EMailAddress) as M
    on M.SubscriberID = T.SubscriberID and T.EmailAddress = M.EmailAddress

    I didn't know if your SubscriberID is unique... if it is, then you don't need T.EmailAddress = M.EmailAddress in your join.

    Since you don't care which SubscriberID is returned (if multiple have the same e-mail address) this will give you the last SubscriberID for each EmailAddress. You could just as simply used MIN if you wanted the first SubscriberID for each EmailAddress.

  14. #14
    Join Date
    May 2005
    Location
    South Africa
    Posts
    1,365
    Provided Answers: 1
    select EmailAddress, SubscriberID, FirstName, Surname
    from TestMailingList a
    where a.SubscriberID=
    (select max(b.SubscriberID) from TestMailingList b
    where a.EmailAddress = b.EmailAddress)

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •