Results 1 to 3 of 3
  1. #1
    Join Date
    Mar 2011
    Posts
    4

    Red face Unanswered: Db2 find duplicate record across multiple tables

    i have 32 tables of each state road say, delhi_road,haryana_road,up_road,mh_road..etc for all states

    i have edge_id column in each table which has a unique integer value.
    now i want to find duplicate edge_id across these multiple and same table.

    on same table i can make edge_id as a primary key ,but how can i check that delhi_road edge_id is not duplicated in haryana_road edge_id or mh_road edge_id or vice versa.

    please suggest me a query/strored proc/or any other option to perform this .

    Sanjay

  2. #2
    Join Date
    May 2009
    Posts
    508
    Provided Answers: 1
    Here is one way that should work. There are probably others.
    Code:
    WITH DUP_EDGE (EDGE_ID)
      AS (
          SELECT EDGE_ID FROM TAB01 UNION ALL
          SELECT EDGE_ID FROM TAB01 UNION ALL
          SELECT EDGE_ID FROM TAB01 UNION ALL
          ...
          SELECT EDGE_ID FROM TAB31 UNION ALL
          SELECT EDGE_ID FROM TAB31 
          GROUP BY EDGE_ID
          HAVING COUNT(*) > 1
         )
    SELECT 'TAB01' AS TAB_NAME
         , EDGE_ID
    FROM TAB01
           INNER JOIN
         DUP_EDGE
           ON TAB01.EDGE_ID = DUP_EDGE.EDGE_ID
    UNION ALL
    SELECT 'TAB02' AS TAB_NAME
         , EDGE_ID
    FROM TAB02
           INNER JOIN
         DUP_EDGE
           ON TAB02.EDGE_ID = DUP_EDGE.EDGE_ID
    
    ...
    UNION ALL
    SELECT 'TAB32' AS TAB_NAME
         , EDGE_ID
    FROM TAB32
           INNER JOIN
         DUP_EDGE
           ON TAB32.EDGE_ID = DUP_EDGE.EDGE_ID
    The Common Table Expression creates a 'virtual' table of distinct EDGE_ID values that are in the table(s) more than once.

    This is used to join to each table to get the rows that have any of these duplicate values. To know which table the row comes from a 'label' TAB_NAM is added to the Select.

  3. #3
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    Here is another example.

    But, this may not perform well as Stealth_DBA's example,
    because it is necessary to join all 32 tables at once and to specify 32*(32-1)/2 = 496 ORed conditions.

    Code:
    SELECT
           COALESCE(t01.edge_id , t01.edge_id , ... , t31.edge_id) /* t32.edge_id is not necessary */
           AS edge_id
         ,    CASE WHEN t01.edge_id IS NOT NULL THEN 'Y' ELSE '.' END
           || CASE WHEN t02.edge_id IS NOT NULL THEN 'Y' ELSE '.' END
           ...
           || CASE WHEN t32.edge_id IS NOT NULL THEN 'Y' ELSE '.' END
           AS duplicates
    
     FROM
           delhi_road   t01
     FULL OUTER JOIN
           haryana_road t02
       ON  t02.edge_id = t01.edge_id
     FULL OUTER JOIN
           mh_road      t03
       ON  t03.edge_id = COALESCE(t01.edge_id , t02.edge_id)
    ...
     FULL OUTER JOIN
           zzzz_road    t32
       ON  t32.edge_id = COALESCE(t01.edge_id , t02.edge_id , ... , t31.edge_id)
    
     WHERE
    /*
    number of ORed conditiones are
     32 * (32 - 1) / 2 = 496
    */
           t01.edge_id = t02.edge_id
       OR  t01.edge_id = t03.edge_id
    ...
       OR  t01.edge_id = t32.edge_id
    
       OR  t02.edge_id = t03.edge_id
       OR  t02.edge_id = t04.edge_id
    ...
       OR  t02.edge_id = t32.edge_id
    
       OR  t03.edge_id = t04.edge_id
       OR  t03.edge_id = t05.edge_id
    ...
       OR  t03.edge_id = t32.edge_id
    
    ...
    ...
    
       OR  t30.edge_id = t31.edge_id
       OR  t30.edge_id = t32.edge_id
    
       OR  t31.edge_id = t32.edge_id
    
    /*
    ORDER BY
          edge_id
    */
    ;
    This is a sample output of 4 tables:
    Code:
    ------------------------------ Commands Entered ------------------------------
    WITH
      delhi_road  (edge_id) AS (
    VALUES 0,1,2,3,4,5,6,7 ,21
    )
    , haryana_road(edge_id) AS (
    VALUES 0,1,2,3,8,9,10,11 ,22
    )
    , mh_road     (edge_id) AS (
    VALUES 0,1,4,5,8,9,12,13 ,23
    )
    , east_road   (edge_id) AS (
    VALUES 0,2,4,6,8,10,12,14 ,24
    )
    SELECT
           COALESCE(t01.edge_id , t01.edge_id , t02.edge_id , t03.edge_id)
           AS edge_id
         , '   '
           || CASE WHEN t01.edge_id IS NOT NULL THEN 'Y' ELSE '.' END
           || CASE WHEN t02.edge_id IS NOT NULL THEN 'Y' ELSE '.' END
           || CASE WHEN t03.edge_id IS NOT NULL THEN 'Y' ELSE '.' END
           || CASE WHEN t04.edge_id IS NOT NULL THEN 'Y' ELSE '.' END
           AS duplicates
    
         , SMALLINT(t01.edge_id) AS id01
         , SMALLINT(t02.edge_id) AS id02
         , SMALLINT(t03.edge_id) AS id03
         , SMALLINT(t04.edge_id) AS id04
     FROM
           delhi_road   t01
     FULL OUTER JOIN
           haryana_road t02
       ON  t02.edge_id = t01.edge_id
     FULL OUTER JOIN
           mh_road      t03
       ON  t03.edge_id = COALESCE(t01.edge_id , t02.edge_id)
     FULL OUTER JOIN
           east_road    t04
       ON  t04.edge_id = COALESCE(t01.edge_id , t02.edge_id , t03.edge_id)
     WHERE
           t01.edge_id = t02.edge_id
       OR  t01.edge_id = t03.edge_id
       OR  t01.edge_id = t04.edge_id
    
       OR  t02.edge_id = t03.edge_id
       OR  t02.edge_id = t04.edge_id
    
       OR  t03.edge_id = t04.edge_id
     ORDER BY
           edge_id
    ;
    ------------------------------------------------------------------------------
    
    EDGE_ID     DUPLICATES ID01   ID02   ID03   ID04  
    ----------- ---------- ------ ------ ------ ------
              0    YYYY         0      0      0      0
              1    YYY.         1      1      1      -
              2    YY.Y         2      2      -      2
              3    YY..         3      3      -      -
              4    Y.YY         4      -      4      4
              5    Y.Y.         5      -      5      -
              6    Y..Y         6      -      -      6
              8    .YYY         -      8      8      8
              9    .YY.         -      9      9      -
             10    .Y.Y         -     10      -     10
             12    ..YY         -      -     12     12
    
      11 record(s) selected.

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •