Here is the scenario. There is table where the unique-ness is identified based on a set of columns. The table does not have a serial column.
There are duplicates in the table too. Now, what I want to do is - Select only those records which do not have more than one occurence in the table for the given set of columns. I just simply want to leave the duplicate rows, all of them.
First thing that would come in one's mind would be SELECT DISTINCT but the problem is I want some more columns values apart from those that form the uniqueness rule. How should I go ahead with the solution to this? Please suggest. Thanks for taking a look. Best regards.
You may not have a SERIAL column but don't forget about the unseen-but-always-there ROWID column ..... could be just what you're looking for.
I usually approach the "identify duplicates" problem by comparing a table to itself using table aliases & the ROWID. You have columns that identify the uniqueness of a row (perhaps there's an index based on these values) & other columns that you wish to include in the WHERE criteria so perhaps you can start with / build on this:
Assume that col1, col2, and col3 form the uniqueness
Select unique a.col1, a.col2, a.col3, a.col4, a.col5
from table1 a,
where a.col1 = b.col1 and
a.col2 = b.col2 and
a.col3 = b.col3 and
a.col4 = b.col4 and
a.col5 = b.col5 and
a.ROWID <> b.ROWID
Thanks mjldba. The rowid approach seems to fit in well but I saw the documentation and it stated that rowid is available only for fragmented tables. I don't really understand the details of a fragmented table and a non-fragmented table but does that affect the performance the table related DML statements and other operations? Which one is better from this perspective?
Also, what would be the approach if rowid is not available.. that is the table is an unfragmented one? Regards.