Results 1 to 3 of 3
  1. #1
    Join Date
    Dec 2007

    Lightbulb Unanswered: Need help with complex query

    Hi All

    I need some help with a complex query (or maybe it's not that complex for the experts here).

    I have a table which contains a large amount of data. Among them a lot of duplicates. below tablestructure :

    1 12/12/1980 John
    2 John
    3 11/11/1970 Lisa

    Now, as you can see , the record is a duplicate in the table. However, the first record with contains more data then the second one (birthday field is completed in the first record).

    I would like to deduplicate my table but retain the record with the most data in it. So in this case the second record (ID 2) should be deleted since it contains less data then ID 1

    Is this possible?
    And if so , how would you proceed ?

    thank you in advance,

  2. #2
    Join Date
    Feb 2004
    Bangalore, India
    what is the criteria of more data/less data? are there any mandatory attributes that help define this more data criteria? like name/ DOB ?

    -- query to find duplicate records
    Select EMail from <tablename> Group by Email having count(1) >1

    --query to find the duplicate records with more data ( incase, records with name and dob)
    Select T.ID, T.Name, T.DOB, T.Email
    From <tablename> T
    Inner join (Select EMail from <tablename> Group by Email having count(1) >1) Dup
    On Dup.Email = T.Email
    And T.Name is not null and T.DOB is not null

    try this one


  3. #3
    Join Date
    Dec 2007
    Dear baburajv

    Thank you for your reply. I had tried very similar to the second solution but unfortunately it did not work out exactly the way I want it. Sometimes it's DOB that contains data, sometimes it's another field , and I need to merge all data of all duplicates into 1 new record.

    I had that query running for about 4 hours but it still was busy. (over 250 million rows was just too much too handle).

    I have now used a commercial tool called DataQualityTools to do exactly what I need and it works out great (and lightning fast).

    Thank you very much again for your efforts !

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts