If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Database Server Software > DB2 > Db2 find duplicate record across multiple tables

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 03-26-11, 06:59
sanjay bharkatiya sanjay bharkatiya is offline
Registered User
 
Join Date: Mar 2011
Posts: 4
Red face Db2 find duplicate record across multiple tables

i have 32 tables of each state road say, delhi_road,haryana_road,up_road,mh_road..etc for all states

i have edge_id column in each table which has a unique integer value.
now i want to find duplicate edge_id across these multiple and same table.

on same table i can make edge_id as a primary key ,but how can i check that delhi_road edge_id is not duplicated in haryana_road edge_id or mh_road edge_id or vice versa.

please suggest me a query/strored proc/or any other option to perform this .

Sanjay
Reply With Quote
  #2 (permalink)  
Old 03-26-11, 08:48
Stealth_DBA Stealth_DBA is offline
Registered User
 
Join Date: May 2009
Posts: 472
Here is one way that should work. There are probably others.
Code:
WITH DUP_EDGE (EDGE_ID)
  AS (
      SELECT EDGE_ID FROM TAB01 UNION ALL
      SELECT EDGE_ID FROM TAB01 UNION ALL
      SELECT EDGE_ID FROM TAB01 UNION ALL
      ...
      SELECT EDGE_ID FROM TAB31 UNION ALL
      SELECT EDGE_ID FROM TAB31 
      GROUP BY EDGE_ID
      HAVING COUNT(*) > 1
     )
SELECT 'TAB01' AS TAB_NAME
     , EDGE_ID
FROM TAB01
       INNER JOIN
     DUP_EDGE
       ON TAB01.EDGE_ID = DUP_EDGE.EDGE_ID
UNION ALL
SELECT 'TAB02' AS TAB_NAME
     , EDGE_ID
FROM TAB02
       INNER JOIN
     DUP_EDGE
       ON TAB02.EDGE_ID = DUP_EDGE.EDGE_ID

...
UNION ALL
SELECT 'TAB32' AS TAB_NAME
     , EDGE_ID
FROM TAB32
       INNER JOIN
     DUP_EDGE
       ON TAB32.EDGE_ID = DUP_EDGE.EDGE_ID
The Common Table Expression creates a 'virtual' table of distinct EDGE_ID values that are in the table(s) more than once.

This is used to join to each table to get the rows that have any of these duplicate values. To know which table the row comes from a 'label' TAB_NAM is added to the Select.
Reply With Quote
  #3 (permalink)  
Old 03-26-11, 10:52
tonkuma tonkuma is online now
Registered User
 
Join Date: Feb 2008
Location: Japan
Posts: 2,195
Here is another example.

But, this may not perform well as Stealth_DBA's example,
because it is necessary to join all 32 tables at once and to specify 32*(32-1)/2 = 496 ORed conditions.

Code:
SELECT
       COALESCE(t01.edge_id , t01.edge_id , ... , t31.edge_id) /* t32.edge_id is not necessary */
       AS edge_id
     ,    CASE WHEN t01.edge_id IS NOT NULL THEN 'Y' ELSE '.' END
       || CASE WHEN t02.edge_id IS NOT NULL THEN 'Y' ELSE '.' END
       ...
       || CASE WHEN t32.edge_id IS NOT NULL THEN 'Y' ELSE '.' END
       AS duplicates

 FROM
       delhi_road   t01
 FULL OUTER JOIN
       haryana_road t02
   ON  t02.edge_id = t01.edge_id
 FULL OUTER JOIN
       mh_road      t03
   ON  t03.edge_id = COALESCE(t01.edge_id , t02.edge_id)
...
 FULL OUTER JOIN
       zzzz_road    t32
   ON  t32.edge_id = COALESCE(t01.edge_id , t02.edge_id , ... , t31.edge_id)

 WHERE
/*
number of ORed conditiones are
 32 * (32 - 1) / 2 = 496
*/
       t01.edge_id = t02.edge_id
   OR  t01.edge_id = t03.edge_id
...
   OR  t01.edge_id = t32.edge_id

   OR  t02.edge_id = t03.edge_id
   OR  t02.edge_id = t04.edge_id
...
   OR  t02.edge_id = t32.edge_id

   OR  t03.edge_id = t04.edge_id
   OR  t03.edge_id = t05.edge_id
...
   OR  t03.edge_id = t32.edge_id

...
...

   OR  t30.edge_id = t31.edge_id
   OR  t30.edge_id = t32.edge_id

   OR  t31.edge_id = t32.edge_id

/*
ORDER BY
      edge_id
*/
;
This is a sample output of 4 tables:
Code:
------------------------------ Commands Entered ------------------------------
WITH
  delhi_road  (edge_id) AS (
VALUES 0,1,2,3,4,5,6,7 ,21
)
, haryana_road(edge_id) AS (
VALUES 0,1,2,3,8,9,10,11 ,22
)
, mh_road     (edge_id) AS (
VALUES 0,1,4,5,8,9,12,13 ,23
)
, east_road   (edge_id) AS (
VALUES 0,2,4,6,8,10,12,14 ,24
)
SELECT
       COALESCE(t01.edge_id , t01.edge_id , t02.edge_id , t03.edge_id)
       AS edge_id
     , '   '
       || CASE WHEN t01.edge_id IS NOT NULL THEN 'Y' ELSE '.' END
       || CASE WHEN t02.edge_id IS NOT NULL THEN 'Y' ELSE '.' END
       || CASE WHEN t03.edge_id IS NOT NULL THEN 'Y' ELSE '.' END
       || CASE WHEN t04.edge_id IS NOT NULL THEN 'Y' ELSE '.' END
       AS duplicates

     , SMALLINT(t01.edge_id) AS id01
     , SMALLINT(t02.edge_id) AS id02
     , SMALLINT(t03.edge_id) AS id03
     , SMALLINT(t04.edge_id) AS id04
 FROM
       delhi_road   t01
 FULL OUTER JOIN
       haryana_road t02
   ON  t02.edge_id = t01.edge_id
 FULL OUTER JOIN
       mh_road      t03
   ON  t03.edge_id = COALESCE(t01.edge_id , t02.edge_id)
 FULL OUTER JOIN
       east_road    t04
   ON  t04.edge_id = COALESCE(t01.edge_id , t02.edge_id , t03.edge_id)
 WHERE
       t01.edge_id = t02.edge_id
   OR  t01.edge_id = t03.edge_id
   OR  t01.edge_id = t04.edge_id

   OR  t02.edge_id = t03.edge_id
   OR  t02.edge_id = t04.edge_id

   OR  t03.edge_id = t04.edge_id
 ORDER BY
       edge_id
;
------------------------------------------------------------------------------

EDGE_ID     DUPLICATES ID01   ID02   ID03   ID04  
----------- ---------- ------ ------ ------ ------
          0    YYYY         0      0      0      0
          1    YYY.         1      1      1      -
          2    YY.Y         2      2      -      2
          3    YY..         3      3      -      -
          4    Y.YY         4      -      4      4
          5    Y.Y.         5      -      5      -
          6    Y..Y         6      -      -      6
          8    .YYY         -      8      8      8
          9    .YY.         -      9      9      -
         10    .Y.Y         -     10      -     10
         12    ..YY         -      -     12     12

  11 record(s) selected.
Reply With Quote
Reply

Tags
db2, duplicate ids

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On