Results 1 to 5 of 5
  1. #1
    Join Date
    Jan 2006
    Posts
    38

    Unanswered: Removing duplicates from a table

    I am stuck with yet another problem.

    Here is the scenario. There is table where the unique-ness is identified based on a set of columns. The table does not have a serial column.

    There are duplicates in the table too. Now, what I want to do is - Select only those records which do not have more than one occurence in the table for the given set of columns. I just simply want to leave the duplicate rows, all of them.

    First thing that would come in one's mind would be SELECT DISTINCT but the problem is I want some more columns values apart from those that form the uniqueness rule. How should I go ahead with the solution to this? Please suggest. Thanks for taking a look. Best regards.

  2. #2
    Join Date
    Dec 2003
    Location
    North America
    Posts
    146
    You may not have a SERIAL column but don't forget about the unseen-but-always-there ROWID column ..... could be just what you're looking for.

    I usually approach the "identify duplicates" problem by comparing a table to itself using table aliases & the ROWID. You have columns that identify the uniqueness of a row (perhaps there's an index based on these values) & other columns that you wish to include in the WHERE criteria so perhaps you can start with / build on this:

    Assume that col1, col2, and col3 form the uniqueness

    Select unique a.col1, a.col2, a.col3, a.col4, a.col5
    from table1 a,
    table1 b
    where a.col1 = b.col1 and
    a.col2 = b.col2 and
    a.col3 = b.col3 and
    a.col4 = b.col4 and
    a.col5 = b.col5 and
    a.ROWID <> b.ROWID
    Last edited by mjldba; 03-28-06 at 11:15.

  3. #3
    Join Date
    Jan 2006
    Posts
    38
    Thanks mjldba. The rowid approach seems to fit in well but I saw the documentation and it stated that rowid is available only for fragmented tables. I don't really understand the details of a fragmented table and a non-fragmented table but does that affect the performance the table related DML statements and other operations? Which one is better from this perspective?

    Also, what would be the approach if rowid is not available.. that is the table is an unfragmented one? Regards.

  4. #4
    Join Date
    May 2004
    Location
    New York
    Posts
    248
    The ROWID is always there, the only issue that you might face is that if your table is fragmented across multiple dbspaces then it could have a duplicate rowid

    Cheers

  5. #5
    Join Date
    Dec 2003
    Location
    North America
    Posts
    146
    Agree with artemka's response. Do you know if you have a fragmented (one table distributed to more than 1 dbspace) or non-fragmented (one table in one dbspace) schema for this table?

    In non-fragmented tables, the ROWID identifies the location of the row within the dbspace.

    In fragmented tables, Informix does not assign a ROWID to each row but you can explicitly create a ROWID column by specifying WITH ROWID when the table is created.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •