If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Database Server Software > DB2 > how to identify duplicate keys

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 09-07-09, 18:37
rdba rdba is offline
Registered User
 
Join Date: Aug 2009
Posts: 30
how to identify duplicate keys

How to identify duplicate keys when create unique index fails with duplicate key error?
Reply With Quote
  #2 (permalink)  
Old 09-07-09, 19:21
DB2Plus DB2Plus is offline
Registered User
 
Join Date: Jul 2009
Posts: 150
select count(*), keys
from some_table
group by keys
having count(*) > 1


Kara S.
Reply With Quote
  #3 (permalink)  
Old 09-07-09, 19:27
db2girl db2girl is offline
∞∞∞∞∞∞
 
Join Date: Aug 2008
Location: Toronto, Canada
Posts: 1,816
select * from <tab_name> where <index_col> in (select <index_col> from <tab_name> group by <index_col> having count(*) > 1)


For example:


test@p6db2serv /home/test > db2 "select * from t1"

C1 C2
----------- -----------
1 1
1 2
2 1
2 2
3 1
4 1

6 record(s) selected.

test@p6db2serv /home/test > db2 "create unique index i1 on t1 (c1)"
DB21034E The command was processed as an SQL statement because it was not a
valid Command Line Processor command. During SQL processing it returned:
SQL0603N A unique index cannot be created because the table contains data
that would result in duplicate index entries. SQLSTATE=23515


test@p6db2serv /home/test > db2 "select * from t1 where c1 in (select c1 from t1 group by c1 having count(*) > 1)"

C1 C2
----------- -----------
1 1
1 2
2 1
2 2

4 record(s) selected.
Reply With Quote
  #4 (permalink)  
Old 09-07-09, 20:42
rdba rdba is offline
Registered User
 
Join Date: Aug 2009
Posts: 30
Will this work if create unique index is on multiple columns?
Reply With Quote
  #5 (permalink)  
Old 09-07-09, 21:38
DB2Plus DB2Plus is offline
Registered User
 
Join Date: Jul 2009
Posts: 150
Quote:
Originally Posted by rdba
Will this work if create unique index is on multiple columns?
It will work on any combination of columns in key.
Reply With Quote
  #6 (permalink)  
Old 09-07-09, 22:12
rdba rdba is offline
Registered User
 
Join Date: Aug 2009
Posts: 30
This does not work for me if the index consists of multiple columns. Can you please show an example with some data?
Reply With Quote
  #7 (permalink)  
Old 09-07-09, 22:31
Stealth_DBA Stealth_DBA is offline
Registered User
 
Join Date: May 2009
Posts: 472
rdba, Just put the multiple columns in the SELECT and the GROUP BY.
Code:
SELECT key-col1, key-col2, key-colx, COUNT(*) AS CNT
FROM table-name
GROUP BY key-col1, key-col2, key-colx
HAVING COUNT(*) > 1
Reply With Quote
  #8 (permalink)  
Old 09-08-09, 09:57
Cougar8000 Cougar8000 is offline
Registered User
 
Join Date: Nov 2005
Location: IL
Posts: 554
Why not try a different approach. Create UK/PK first, then do a load utilizing the exception table which will be populated with dups that were rejected.
__________________
--
IBM Certified DBA on DB2 for Linux, UNIX, and Windows

DB2 v9.1.0.2 os 5.3.0.0
Reply With Quote
  #9 (permalink)  
Old 09-08-09, 14:47
db2girl db2girl is offline
∞∞∞∞∞∞
 
Join Date: Aug 2008
Location: Toronto, Canada
Posts: 1,816
I think the issue with load / exception table is that db2 will load the first duplicate record and send the rest of duplicates to the exception table. So, the exception table will contain all duplicates expect for the one db2 was able to load.
Reply With Quote
  #10 (permalink)  
Old 09-08-09, 14:56
db2girl db2girl is offline
∞∞∞∞∞∞
 
Join Date: Aug 2008
Location: Toronto, Canada
Posts: 1,816
I modified my example to create a unique index on two columns and display all duplicate records.

Example:

test@p6db2serv /home/test > db2 "select * from t1"

C1 C2 C3
----------- ----------- -----------
1 1 1
1 1 1
1 2 2

3 record(s) selected.

test@p6db2serv /home/test > db2 "create unique index i1 on t1 (c1,c2)"
DB21034E The command was processed as an SQL statement because it was not a
valid Command Line Processor command. During SQL processing it returned:
SQL0603N A unique index cannot be created because the table contains data
that would result in duplicate index entries. SQLSTATE=23515


test@p6db2serv /home/test > db2 "select c1, c2, count(*) as cnt from t1 group by c1, c2 having count (*) > 1"

C1 C2 CNT
----------- ----------- -----------
1 1 2

1 record(s) selected.


count returns 2 rows (correct), but the output contains 1 row (wrong)


- Modification to my original query:

test@p6db2serv /home/test > db2 "select * from t1 where c1 in (select c1 from t1 group by c1 having count(*) > 1) and c2 in (select c2 from t1 group by c2 having count(*) > 1)"


C1 C2 C3
----------- ----------- -----------
1 1 1
1 1 1


2 record(s) selected.


There must be a better way to write this query...
Reply With Quote
  #11 (permalink)  
Old 09-08-09, 15:02
Cougar8000 Cougar8000 is offline
Registered User
 
Join Date: Nov 2005
Location: IL
Posts: 554
Isn't it what the objective is? To only load unique records and be able to review the dups?

Generally speaking you always have your UK created before you start loading data. In this case your dups will be dumped and you wont know what they are if you are not going to have an exception table.
__________________
--
IBM Certified DBA on DB2 for Linux, UNIX, and Windows

DB2 v9.1.0.2 os 5.3.0.0
Reply With Quote
  #12 (permalink)  
Old 09-08-09, 15:06
db2girl db2girl is offline
∞∞∞∞∞∞
 
Join Date: Aug 2008
Location: Toronto, Canada
Posts: 1,816
I'm not sure what the objective is... I thought rdba wanted to list all duplicate records when I first read the question.
Reply With Quote
  #13 (permalink)  
Old 09-08-09, 15:12
Cougar8000 Cougar8000 is offline
Registered User
 
Join Date: Nov 2005
Location: IL
Posts: 554
Maybe we are getting lost in understanding exactly what rdba needs. in my experience you create structure first and then load data. I guess it is a the biggest dilema of them all. What came first Chicken or the egg
__________________
--
IBM Certified DBA on DB2 for Linux, UNIX, and Windows

DB2 v9.1.0.2 os 5.3.0.0
Reply With Quote
  #14 (permalink)  
Old 09-08-09, 15:22
db2girl db2girl is offline
∞∞∞∞∞∞
 
Join Date: Aug 2008
Location: Toronto, Canada
Posts: 1,816
While we're waiting for rdb's feedback, can someone please help rewrite my last query. This one:

db2 "select * from t1 where c1 in (select c1 from t1 group by c1 having count(*) > 1) and c2 in (select c2 from t1 group by c2 having count(*) > 1)"


Thanks
Reply With Quote
  #15 (permalink)  
Old 09-08-09, 15:32
Stealth_DBA Stealth_DBA is offline
Registered User
 
Join Date: May 2009
Posts: 472
Bella, My previous post showed how to find the duplicate values in a table:
Code:
SELECT key-col1, key-col2, ..., key-colx, COUNT(*) AS CNT
FROM table-name
GROUP BY key-col1, key-col2, ..., key-colx
HAVING COUNT(*) > 1
The basic query is the same whether you have one or many columns. Just list all the columns you want to check for duplication in the Select list and the Group By.
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On