| |
|
If this is your first visit, be sure to check out the FAQ by clicking the link above.
You may have to register before you can post: click the register link above to proceed.
To start viewing messages, select the forum that you want to visit from the selection below.
|
 |

02-05-12, 12:17
|
|
Registered User
|
|
Join Date: Feb 2012
Posts: 7
|
|
|
Delete Duplicate Records-without primary key
|
|
Hi All ,
I want to delete the duplicate record from a table keeping 1 record aside.
my base table is-info
id name class
2 abc 6a
3 abc 6a
4 abc 6a
1 abc 6a
2 abc 6a
4 abc 6a
4 abc 6a
3 abc 6a
3 abc 6a
1 abc 6a
2 abc 6a
5 abc 6a
id-int
name-text
class-varchar
(there is no primary key in this table)
Now i want the result in following way:
id name class
2 abc 6a
3 abc 6a
4 abc 6a
1 abc 6a
I have tried the following query and its running fine but its not a dynamic stuff.
DELETE top (SELECT COUNT(*)-1 FROM aaa WHERE id ='3') --or put some number
FROM aaa
WHERE id ='3'-- or put some number
So i was wondering if some one could help me in this.
Many thanks,
Preetpal kapoor
|
|

02-05-12, 12:38
|
|
Registered User
|
|
Join Date: Nov 2003
Posts: 2,407
|
|
If you have enough disk space and then you could do the following:
Code:
SELECT DISTINCT id, name, class
into #temp
FROM aaa;
TRUNCATE TABLE aaa;
INSERT INTO aaa (id, name, class)
SELECT id, name, class
FROM #temp;
COMMIT;
|
|

02-05-12, 12:47
|
|
Registered User
|
|
Join Date: Feb 2012
Posts: 7
|
|
|
|
Sorry I cant make any temp table.
I know this method but this should be done through a single query.
|
|

02-05-12, 12:57
|
|
Registered User
|
|
Join Date: Nov 2003
Posts: 2,407
|
|
Quote:
Originally Posted by preetpalkpaoor
Sorry I cant make any temp table.
|
Why is that so?
|
|

02-05-12, 13:07
|
|
Registered User
|
|
Join Date: Feb 2012
Posts: 7
|
|
Because i have to optimize the query execution time.
And by creating a temp table and then deleting it will slow down the processing of a query.
This query will be used on a real time system and if it hampers the performance then it is of no use.
|
|

02-05-12, 13:17
|
|
Registered User
|
|
Join Date: Nov 2003
Posts: 2,407
|
|
Quote:
Originally Posted by preetpalkpaoor
Because i have to optimize the query execution time.
And by creating a temp table and then deleting it will slow down the processing of a query.
This query will be used on a real time system and if it hampers the performance then it is of no use.
|
So did you test if if the execution time was too slow?
And if this "hampers" the performance?
|
|

02-05-12, 13:40
|
|
Registered User
|
|
Join Date: Feb 2012
Posts: 7
|
|
At present the table which i have provided you in my query is a fake one (or just for practice) and with this much data it will not hamper .
But on my real time data it is hampering because i have tried this earlier and sued by my senior.
 
|
|

02-05-12, 18:31
|
|
Resident Curmudgeon
|
|
Join Date: Feb 2004
Location: In front of the computer
Posts: 12,609
|
|
With no PK, no index, and billions of rows there is no answer. That schema design is so careless that it makes a solution impossible.
There is so much that you haven't told us about the business rules that affect your table that we can only play guessing games. You really need to hire a professional consultant to help you resolve this issue or you need to work on giving us a better description of what you want.
What is the time window that you can work within? What kind of hardware are you using? How many rows are there in the table you are trying to de-dupe? How many of those rows are duplicates? What are the business rules that govern this table? What kind of processing environment does this server work within? What other things might we need to know? What have you tried, and why weren't your attempts acceptable?
Give us the information that we need to solve your problem for you, you might be surprised at what we can do!
-PatP
__________________
In theory, theory and practice are identical. In practice, theory and practice are unrelated.
|
|

02-06-12, 00:12
|
|
Registered User
|
|
Join Date: Feb 2012
Location: DUBAI
Posts: 5
|
|
|
Deleting Duplicate Records From MSSQL
-- Change the #Duplicate_Records to your own table name
-- Change the field as your own tables field
-- Take backup of the original table and test before applying on live
select * from #Duplicate_Records
--Show Duplicate Records in Duplicate_Records
SELECT id, COUNT(*) FROM #Duplicate_Records
GROUP BY id HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC
SET ROWCOUNT 1
SELECT @@rowcount
WHILE @@rowcount > 0
DELETE DupRec FROM #Duplicate_Records as DupRec
INNER JOIN
(SELECT id FROM #Duplicate_Records GROUP BY id HAVING count(*) > 1)
AS c ON c.id = DupRec.id
SET ROWCOUNT 0
SELECT * FROM #Duplicate_Records
-- akhilesh
|
|

02-06-12, 02:26
|
|
Registered User
|
|
Join Date: Dec 2006
Posts: 24
|
|
do a select distinct or reduce doubletts by grouping correctly
with the result do a select into on a new table beeing identic plus a primary key ( int autoincrement or whatever)
rename the old table to garbage_without keye
rename the new table to the name of the old one.
...and never ever create a new table without an identity column 
__________________
an Apple a day keeps Dr. Watson away !
|
|

02-06-12, 06:34
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,534
|
|
Quote:
Originally Posted by wegus
...and never ever create a new table without an identity column 
|
you probably meant to say without a primary key
having an identitly column is no guarantee that "duplicates" won't be entered
(of course, they aren't true duplicates, they'ss all have different identity column values, but the rest of the columns can easily be duplicated)
|
|

02-06-12, 07:06
|
|
Registered User
|
|
Join Date: Dec 2006
Posts: 24
|
|
Quote:
|
Originally Posted by r937
having an identitly column is no guarantee that "duplicates" won't be entered
|
in some cases duplicates might be possible in those cases it is good to have an identity column. In other cases you might not want to save duplicates, then you ougt to ad a primary key. Of course the latter is what mostly happens, but the first case might also occur. So i did mean "at least identities" or even better primary keys. That is what i intendet to say. 
__________________
an Apple a day keeps Dr. Watson away !
|
|

02-06-12, 07:37
|
|
Registered User
|
|
Join Date: Nov 2003
Posts: 2,407
|
|
Quote:
Originally Posted by preetpalkpaoor
At present the table which i have provided you in my query is a fake one (or just for practice) and with this much data it will not hamper .
But on my real time data it is hampering because i have tried this earlier and sued by my senior
|
I don't believe that any statement that is able to delete those duplicates without using a temp table will be more efficient (= will hamper the server less) than the solution with a temp table
|
|

02-06-12, 12:22
|
|
Registered User
|
|
Join Date: Feb 2012
Posts: 7
|
|
Thanks every one for your support.
i got 3 methods to do that.
1. To make temp table with i will check with my real time DB and is the easiest method.
2. With CTE. Actually i dnt have any knowledge about CTE.
3.It was my way but a lengthy one.
DELETE top (SELECT COUNT(*)-1 FROM aaa WHERE id ='3') --or put some number
FROM aaa
WHERE id ='3'
here i need to always change the ID.

|
|

02-09-12, 01:44
|
|
Registered User
|
|
Join Date: Jan 2012
Posts: 67
|
|
Quote:
Originally Posted by preetpalkpaoor
2. With CTE. Actually i dnt have any knowledge about CTE.
|
Try this (works on 2008 R2):
Code:
with aaa1 as (
select
row_number() over
(partition by id, name, class order by id) rn,
aaa.*
from aaa
)
delete from aaa1
where rn >1;
But I bet that you will be "sued by your manager" again with this query
- this query requires a sort of the whole table, without proper indexes
it wont perform very well. The method with a temp table is the fastests.
|
Last edited by kordirko; 02-09-12 at 02:42.
|
| Thread Tools |
Search this Thread |
|
|
|
| Display Modes |
Linear Mode
|
Posting Rules
|
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts
HTML code is Off
|
|
|
|
|