Results 1 to 6 of 6
  1. #1
    Join Date
    Feb 2012
    Location
    Sears Tower, Chicago, Illinois
    Posts
    3

    Question Unanswered: locating duplicate records in a table

    I have a table that has duplicate records with the exception of the ID and I am trying to write a while loop that would go through the table, locate all duplicate records based on a field called LASTNAME.
    I am fairly new to T-SQL and found this forum while looking for an answer.
    Can anyone point me in the right direction please?

    Regards
    Richard Krell

  2. #2
    Join Date
    Sep 2001
    Location
    Chicago, Illinois, USA
    Posts
    601
    This will give you a list of all of the LASTNAME records that are duplicated.

    Code:
    select  LASTNAME
            count(*) CountOfLASTNAME
    from    YourTable
    group
    by      LASTNAME
    having  count(*)<>1
    Ken

    Maverick Software Design

    (847) 864-3600 x2

  3. #3
    Join Date
    Feb 2012
    Location
    Sears Tower, Chicago, Illinois
    Posts
    3

    Better Explanation

    Thanks for that script Ken but I guess I didn't describe my problem well enough. My Bad!!

    Someone on my team took a copy of a table and merged it into the original duplicating every record in the original table. Then there were many more updates made to the table before this error was discovered. So my problem is that I have the original table with a copy merged and also many new records.

    I am trying to find a way to search the table locating all duplicate records which can be identified by their ID. Then drop those duplicates with the higher ID's. See below for example:

    C:\Users\rkrell>SQLCMD
    1> USE WEBDATA
    2> GO
    Changed database context to 'webdata'.
    1> SELECT ID, LASTNAME, FIRSTNAME, USERKEY
    2> FROM USER_ACCOUNT
    3> WHERE USERKEY='RKRELL'
    4> GO
    ID LASTNAME FIRSTNAME USERKEY
    ----------- -------------------- -------------------- --------------
    637 KRELL RICHARD RKRELL
    1316 KRELL RICHARD RKRELL

    (2 rows affected)

    I do apologize for wasting your time and hope this explains my issue better. If not please let me know.

  4. #4
    Join Date
    Sep 2001
    Location
    Chicago, Illinois, USA
    Posts
    601
    Something like this? The LASTNAME, FIRSTNAME and USERKEY may be redundant, but that will be for you to figure-out.

    Don't run this without testing it. First run the inner-most SELECT and see if the results are reasonable. Then run the next-higher SELECT and check those results. Then, run the SELECT with the * and see if those are the records you are expecting to delete. If so, you can then run the whole thing.

    Code:
    delete  YT
    --select  *
    from    YourTable YT
    inner
    join    (
            select	Step1.LASTNAME
                    ,Step1.FIRSTNAME
                    ,Step1.USERKEY
                    max(ID) MaxOfID
            from    (
                    select  LASTNAME
                            ,FIRSTNAME
                            ,USERKEY
                            count(*) CountOfLASTNAME
                    from    YourTable
                    group
                    by      LASTNAME
                            ,FIRSTNAME
                            ,USERKEY
                    having  count(*)<>1
                    ) Step1
            group
            by      Step1.LASTNAME
                    ,Step1.FIRSTNAME
                    ,Step1.USERKEY
                    ) Step2 on
                        YT.ID=Step2.MaxOfID
    Ken

    Maverick Software Design

    (847) 864-3600 x2

  5. #5
    Join Date
    Nov 2004
    Posts
    1,427
    Provided Answers: 4
    This is another way for deleting those doubles:
    Code:
    WITH CTE AS
    (SELECT USER_ACCOUNT.ID, 
    	USER_ACCOUNT.LASTNAME,
    	ROW_NUMBER() OVER (PARTITION BY USER_ACCOUNT.LASTNAME ORDER BY USER_ACCOUNT.ID ASC) aS RowNum
    FROM USER_ACCOUNT
    	INNER JOIN (SELECT LASTNAME
    			FROM USER_ACCOUNT
    			GROUP BY LASTNAME
    			HAVING count(*) > 1) AS T ON
    		USER_ACCOUNT.LASTNAME = T.LASTNAME
    )
    DELETE UA
    FROM USER_ACCOUNT AS UA
    	INNER JOIN CTE ON
    		CTE.ID = UA.ID
    WHERE CTE.RowNum > 1
    With kind regards . . . . . SQL Server 2000/2005/2012
    Wim

    Grabel's Law: 2 is not equal to 3 -- not even for very large values of 2.
    Pat Phelan's Law: 2 very definitely CAN equal 3 -- in at least two programming languages

  6. #6
    Join Date
    Feb 2012
    Location
    Sears Tower, Chicago, Illinois
    Posts
    3

    Talking Resolved

    I want to thank everyone for you assistance in this problem. I used Wim's example and it worked great so thank you Wim.

    Again thanks to both of you.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •