Page 1 of 2 12 LastLast
Results 1 to 15 of 23

Thread: duplicate rows

  1. #1
    Join Date
    Jul 2003
    Location
    The Dark Planet
    Posts
    1,401

    Unanswered: duplicate rows

    What is the best way to remove duplicate rows from a table ?

  2. #2
    Join Date
    Nov 2002
    Location
    Jersey
    Posts
    10,322
    Wow talk about cross posts...

    SELECT DISTINCT Col1, Col2 INTO myTable FROM myOtherTable?
    Brett
    8-)

    It's a Great Day for America everybody!

    dbforums Yak CorralRadio 'Rita
    dbForums Member List
    I'm Good Once as I ever was

    The physical order of data in a database has no meaning.

  3. #3
    Join Date
    Jul 2003
    Location
    The Dark Planet
    Posts
    1,401
    Code:
    SELECT 
    		Seibel_Extract.Account, 
    		Seibel_Extract.Description, 
    		Seibel_Extract.[Alliance Bookings], 
    		Seibel_Extract.[Concatenated Unique ID], 
    		Seibel_Extract.[Sub Region], 
    		Seibel_Extract.Region, 
    		Seibel_Extract.[Update Method], 
    		Seibel_Extract.[Solution Offering], 
    		Seibel_Extract.Type, 
    		Seibel_Extract.[Operate Deal], 
    		Seibel_Extract.[% of Deal], 
    		Seibel_Extract.[Total Bookings], 
    		Seibel_Extract.[Converted Bookings (EURO)], 
    		Seibel_Extract.Alliance
    	FROM 
    		Seibel_Extract
    	WHERE 
    		Seibel_Extract.Account 
    		In 
    			(
    				SELECT 
    					distinct [Account]
    				FROM 
    					[Seibel_Extract] As Tmp 
    				WHERE
    					[Description] = [Seibel_Extract].[Description] 
    					And [Alliance Bookings] = [Seibel_Extract].[Alliance Bookings]
    				GROUP BY 
    					[Account],
    					[Description],
    					[Alliance Bookings] 
    				HAVING 
    					Count(*)>1  
    			)
    This is the query that returns the duplicate records , I need to keep only one out of the duplicate records

  4. #4
    Join Date
    Nov 2002
    Location
    Jersey
    Posts
    10,322
    What do you mean by duplicate? Are all of the columns the same?

    Then add SELECT DISTINCT

    But they're not, are they.

    So what do you mean by duplicate?
    Brett
    8-)

    It's a Great Day for America everybody!

    dbforums Yak CorralRadio 'Rita
    dbForums Member List
    I'm Good Once as I ever was

    The physical order of data in a database has no meaning.

  5. #5
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Provided Answers: 1
    What is the logic you want to apply to decide which one of each set of duplicate records to keep? If the underlying rows of data are truly exact duplicates, you will need to run select distinct into a new table.

    blindman

  6. #6
    Join Date
    Nov 2002
    Location
    Jersey
    Posts
    10,322
    Originally posted by blindman
    What is the logic you want to apply to decide which one of each set of duplicate records to keep? If the underlying rows of data are truly exact duplicates, you will need to run select distinct into a new table.

    blindman
    You know it's going to be more than 1 account number equals a dup.

    You have to make a decision..

    Rows are rows, and the totality of the columns are probably representative of an account.

    The columns "relate" to each just like they are brothers and sisters...the other rows are like cousins..

    You can't just mix and match the family togetehr...

    WHT do you have many rows? Sounds like your addressing the problematic aftermath and not the cause....
    Brett
    8-)

    It's a Great Day for America everybody!

    dbforums Yak CorralRadio 'Rita
    dbForums Member List
    I'm Good Once as I ever was

    The physical order of data in a database has no meaning.

  7. #7
    Join Date
    Jul 2003
    Location
    The Dark Planet
    Posts
    1,401
    The PK is Acc Name, Description and amount only
    Any two rows with these 3 as exact will be treated as dupes

  8. #8
    Join Date
    Jul 2003
    Location
    The Dark Planet
    Posts
    1,401
    WHT do you have many rows? Sounds like your addressing the problematic aftermath and not the cause....
    Yup you are somewhat right . I get some data every week from seibel and i have to clean the data and upload it into sql server. I cant do anything about the data. All I can do is a patch job

  9. #9
    Join Date
    Nov 2002
    Location
    Jersey
    Posts
    10,322
    So the data is historical by nature...is there a last updated date?

    Can you add a column to the base table that has a default GetDate()...

    Then Just select the row where the date=Max(date)
    Brett
    8-)

    It's a Great Day for America everybody!

    dbforums Yak CorralRadio 'Rita
    dbForums Member List
    I'm Good Once as I ever was

    The physical order of data in a database has no meaning.

  10. #10
    Join Date
    Jul 2003
    Location
    The Dark Planet
    Posts
    1,401
    No .. i dont have a last updated date ...
    had there been something like that my work would be a lot easier ... I have only the fact that Any two rows with Acc Name, Description and amount as exact will be treated as dupes

  11. #11
    Join Date
    Nov 2002
    Location
    Jersey
    Posts
    10,322
    What's the DDL for Seibel_Extract


    No date..and identity perhaps?

    How do you populate the table?

    (you can tell I'm running out of ideas....)
    Brett
    8-)

    It's a Great Day for America everybody!

    dbforums Yak CorralRadio 'Rita
    dbForums Member List
    I'm Good Once as I ever was

    The physical order of data in a database has no meaning.

  12. #12
    Join Date
    Jul 2003
    Location
    The Dark Planet
    Posts
    1,401
    I get an excel file which i upload through a dts into a table. No date column to go on . Identity column perhaps i can create and then drop after removing the dupes.

    (you can tell I'm running out of ideas....)

    I already have !!!!!!!

  13. #13
    Join Date
    Nov 2002
    Location
    Jersey
    Posts
    10,322
    So you're not appending the data, you do a full refresh everytime?

    Does it matter to you which row you use?

    If it doesn't....

    [reduced to using a cursor...oh the humanity]

    How about SELECT TOP 1 in a cursor

    [/reduced to using a cursor...oh the humanity]

    And insert the rows to a work table....
    Brett
    8-)

    It's a Great Day for America everybody!

    dbforums Yak CorralRadio 'Rita
    dbForums Member List
    I'm Good Once as I ever was

    The physical order of data in a database has no meaning.

  14. #14
    Join Date
    Nov 2003
    Location
    Christchurch, New Zealand
    Posts
    1,618
    to get rid of the duplicate rows I would add a couple of temporary extra columns to the table. an indentity column to uniquely indentify each row and a colukmn to indicate what rows you want to keep.

    You should then be able to run a query to flag the ones you want to keep and thne another to delte those not flagged.

    After that remove the temp columns.

    As for the stopping of future data corrupting your new fresh table...

    Cahnge your dts package so the when it inserts a record it checks to see if the record already exists, if it exists the skip the row, otherwise insert it.

    If you need more help with this let me know and I will help some more.

  15. #15
    Join Date
    Nov 2002
    Location
    Jersey
    Posts
    10,322
    Originally posted by rokslide
    to get rid of the duplicate rows I would add a couple of temporary extra columns to the table. an indentity column to uniquely indentify each row and a colukmn to indicate what rows you want to keep.

    You should then be able to run a query to flag the ones you want to keep and thne another to delte those not flagged.

    After that remove the temp columns.

    As for the stopping of future data corrupting your new fresh table...

    Cahnge your dts package so the when it inserts a record it checks to see if the record already exists, if it exists the skip the row, otherwise insert it.

    If you need more help with this let me know and I will help some more.
    Just curious...what algorithym would you suggest he use to pick a row?

    Max(Identity)?
    Brett
    8-)

    It's a Great Day for America everybody!

    dbforums Yak CorralRadio 'Rita
    dbForums Member List
    I'm Good Once as I ever was

    The physical order of data in a database has no meaning.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •