Page 1 of 3 123 LastLast
Results 1 to 15 of 39
  1. #1
    Join Date
    Aug 2009
    Posts
    42

    Unanswered: how to identify duplicate keys

    How to identify duplicate keys when create unique index fails with duplicate key error?

  2. #2
    Join Date
    Jul 2009
    Posts
    150
    select count(*), keys
    from some_table
    group by keys
    having count(*) > 1


    Kara S.

  3. #3
    Join Date
    Aug 2008
    Location
    Toronto, Canada
    Posts
    2,369
    select * from <tab_name> where <index_col> in (select <index_col> from <tab_name> group by <index_col> having count(*) > 1)


    For example:


    test@p6db2serv /home/test > db2 "select * from t1"

    C1 C2
    ----------- -----------
    1 1
    1 2
    2 1
    2 2
    3 1
    4 1

    6 record(s) selected.

    test@p6db2serv /home/test > db2 "create unique index i1 on t1 (c1)"
    DB21034E The command was processed as an SQL statement because it was not a
    valid Command Line Processor command. During SQL processing it returned:
    SQL0603N A unique index cannot be created because the table contains data
    that would result in duplicate index entries. SQLSTATE=23515


    test@p6db2serv /home/test > db2 "select * from t1 where c1 in (select c1 from t1 group by c1 having count(*) > 1)"

    C1 C2
    ----------- -----------
    1 1
    1 2
    2 1
    2 2

    4 record(s) selected.

  4. #4
    Join Date
    Aug 2009
    Posts
    42
    Will this work if create unique index is on multiple columns?

  5. #5
    Join Date
    Jul 2009
    Posts
    150
    Quote Originally Posted by rdba
    Will this work if create unique index is on multiple columns?
    It will work on any combination of columns in key.

  6. #6
    Join Date
    Aug 2009
    Posts
    42
    This does not work for me if the index consists of multiple columns. Can you please show an example with some data?

  7. #7
    Join Date
    May 2009
    Posts
    508
    Provided Answers: 1
    rdba, Just put the multiple columns in the SELECT and the GROUP BY.
    Code:
    SELECT key-col1, key-col2, key-colx, COUNT(*) AS CNT
    FROM table-name
    GROUP BY key-col1, key-col2, key-colx
    HAVING COUNT(*) > 1

  8. #8
    Join Date
    Nov 2005
    Location
    IL
    Posts
    557
    Why not try a different approach. Create UK/PK first, then do a load utilizing the exception table which will be populated with dups that were rejected.
    --
    IBM Certified DBA on DB2 for Linux, UNIX, and Windows

    DB2 v9.7.0.6 os 6.1.0.0

  9. #9
    Join Date
    Aug 2008
    Location
    Toronto, Canada
    Posts
    2,369
    I think the issue with load / exception table is that db2 will load the first duplicate record and send the rest of duplicates to the exception table. So, the exception table will contain all duplicates expect for the one db2 was able to load.

  10. #10
    Join Date
    Aug 2008
    Location
    Toronto, Canada
    Posts
    2,369
    I modified my example to create a unique index on two columns and display all duplicate records.

    Example:

    test@p6db2serv /home/test > db2 "select * from t1"

    C1 C2 C3
    ----------- ----------- -----------
    1 1 1
    1 1 1
    1 2 2

    3 record(s) selected.

    test@p6db2serv /home/test > db2 "create unique index i1 on t1 (c1,c2)"
    DB21034E The command was processed as an SQL statement because it was not a
    valid Command Line Processor command. During SQL processing it returned:
    SQL0603N A unique index cannot be created because the table contains data
    that would result in duplicate index entries. SQLSTATE=23515


    test@p6db2serv /home/test > db2 "select c1, c2, count(*) as cnt from t1 group by c1, c2 having count (*) > 1"

    C1 C2 CNT
    ----------- ----------- -----------
    1 1 2

    1 record(s) selected.


    count returns 2 rows (correct), but the output contains 1 row (wrong)


    - Modification to my original query:

    test@p6db2serv /home/test > db2 "select * from t1 where c1 in (select c1 from t1 group by c1 having count(*) > 1) and c2 in (select c2 from t1 group by c2 having count(*) > 1)"


    C1 C2 C3
    ----------- ----------- -----------
    1 1 1
    1 1 1


    2 record(s) selected.


    There must be a better way to write this query...

  11. #11
    Join Date
    Nov 2005
    Location
    IL
    Posts
    557
    Isn't it what the objective is? To only load unique records and be able to review the dups?

    Generally speaking you always have your UK created before you start loading data. In this case your dups will be dumped and you wont know what they are if you are not going to have an exception table.
    --
    IBM Certified DBA on DB2 for Linux, UNIX, and Windows

    DB2 v9.7.0.6 os 6.1.0.0

  12. #12
    Join Date
    Aug 2008
    Location
    Toronto, Canada
    Posts
    2,369
    I'm not sure what the objective is... I thought rdba wanted to list all duplicate records when I first read the question.

  13. #13
    Join Date
    Nov 2005
    Location
    IL
    Posts
    557
    Maybe we are getting lost in understanding exactly what rdba needs. in my experience you create structure first and then load data. I guess it is a the biggest dilema of them all. What came first Chicken or the egg
    --
    IBM Certified DBA on DB2 for Linux, UNIX, and Windows

    DB2 v9.7.0.6 os 6.1.0.0

  14. #14
    Join Date
    Aug 2008
    Location
    Toronto, Canada
    Posts
    2,369
    While we're waiting for rdb's feedback, can someone please help rewrite my last query. This one:

    db2 "select * from t1 where c1 in (select c1 from t1 group by c1 having count(*) > 1) and c2 in (select c2 from t1 group by c2 having count(*) > 1)"


    Thanks

  15. #15
    Join Date
    May 2009
    Posts
    508
    Provided Answers: 1
    Bella, My previous post showed how to find the duplicate values in a table:
    Code:
    SELECT key-col1, key-col2, ..., key-colx, COUNT(*) AS CNT
    FROM table-name
    GROUP BY key-col1, key-col2, ..., key-colx
    HAVING COUNT(*) > 1
    The basic query is the same whether you have one or many columns. Just list all the columns you want to check for duplication in the Select list and the Group By.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •