Results 1 to 11 of 11
  1. #1
    Join Date
    Apr 2010
    Posts
    32

    Unanswered: DB2 Query - find duplicate records with multiple statuses

    hi, need some quick guidance on how to write a query.

    at work we have a table that identifies which products are blocked for certain customers. some items can become unblocked and the record stays in the table, but in "inactive" status (a field that holds its status). there's an issue where instead of updating the actively blocked item to inactive, it inserts a new row with inactive status. now we have some items that are both blocked and unblocked (active and inactive record) at the same time for the same customer.

    i'm trying to pull all active and inactive records for every item across all accounts, but only those items which have both an active and inactive record.

    i'm currently 'selecting' with a 'where exists' but i can only seem to pull the active records and not it's corresponding inactive record(s). i need them all.

    any help is appreciated.

    thanks!
    -dq
    Last edited by DonQuixote; 06-28-12 at 18:31.

  2. #2
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    I'd suggest something like:
    Code:
    SELECT *
       FROM myTable AS a
       WHERE EXISTS (SELECT *
          FROM myTable AS b
          WHERE  b.PrimaryKey = a.PrimaryKey
             AND b.Status <> a.Status)
    This assumes that the PrimaryKey is one or more columns that collectively are at least sufficient to match the rows that you consider "the same, except for status".

    Note that this can get really messy if you have many instances of this kind of problem so you probably want to use a FETCH FIRST clause the first time you run the query!

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  3. #3
    Join Date
    Apr 2010
    Posts
    32
    thanks for the reply. i used a fetch first indeed! because the table has about 70 million rows.

    the problem with this is the primary keys are different for the duplicate rows. so instead i joined by account# and product number together. this resulted in only the "a" records being pulled and none of the "b". How can i get both?

    i tried wrapping this as a temp table inside another query, where i joined the block table again to this. Despite my inner join, i got a bunch of strange rows, which didn't consist of any from this temp table. I think it might still work, but I'm probably doing something wrong with it.

  4. #4
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    Rather than the two of us doing a "dance in the abstract", could you post either the DB2 code that you've got so far or alternatively the SELECT statement to return all of the active rows for exactly one account (using constant values to specify the values that interest you). That ought to be enough to give us quite a head start toward helping you!

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  5. #5
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    I agree with Pat.


    By the way,
    I thought DonQuixote might not understand Pat's suggested query.
    Code:
    SELECT *
       FROM myTable AS a
       WHERE EXISTS (SELECT *
          FROM myTable AS b
          WHERE  b.PrimaryKey = a.PrimaryKey
             AND b.Status <> a.Status)
    Quote Originally Posted by DonQuixote View Post
    ... this resulted in only the "a" records being pulled and none of the "b". How can i get both?
    ...
    But, "a" records include both of 'active' and 'inactive' records.

    And, if both of 'active' and 'inactive' existed for a PrimaryKey, both would be returned.
    Because, by the "b.Status <> a.Status" predicate, 'active' in "a" has corresponding 'inactive' in "b" and 'inactive' in "a" has corresponding 'active' in "b".

  6. #6
    Join Date
    Jun 2007
    Location
    germany
    Posts
    155
    i am not sure but i interpreted
    the problem with this is the primary keys are different for the duplicate rows
    to mean that records that belong together
    based on account number and something else
    could look like this:
    status code a, primary key = 1a1
    status code b, primary key = 1b4

    (sorry, it is friday)
    Dick Brenholtz, Ami in Deutschland

  7. #7
    Join Date
    Apr 2010
    Posts
    32
    thanks tonkuma. now i understand what was meant that both "a" and "b" will get returned; however, i used his query exactly, and it didn't return any rows (i assume due to the primary key matching). I will assume i didn't follow directions; if i didn't, please let me know. i had to change it back to matching by acct# plus item#, but when I did that, the query returned an active record that had no corresponding inactive record, so the rows were not valid.

    Here is a sample table. the b_class isn't necessary to understand, but i included it in case it will help understand the root cause of our issue. when the class changes and someone tries to inactive the block rule, the system doesn't find the existing record and inserts a new record for D, instead of updating the existing record (and class).
    Code:
    b_id (key) || b_account || b_item || b_class || b_active
    12345678   || 123abc    || 12345  || 12      || A
    12345678   || 123abc    || 12345  || 89      || D


    No Rows returned:
    Code:
      SELECT * 
      FROM blocktable br1 
      WHERE EXISTS ( 
        SELECT * 
        FROM blocktable br2 
        WHERE br1.b_id=br2.b_id AND br1.b_active<>br2.b_active
      ) FETCH FIRST 5 ROWS ONLY WITH UR


    Invalid Rows returned (active records without a corresponding inactive record anywhere in the table)
    Code:
      SELECT * 
      FROM blocktable br1 
      WHERE EXISTS ( 
        SELECT * 
        FROM blocktable br2 
        WHERE br1.b_account=br2.b_account AND br1.b_item=br2.b_item AND br1.b_active<>br2.b_active
      ) FETCH FIRST 5 ROWS ONLY WITH UR


    before i came here, my query looked like this, and did return valid rows, but only the active half.

    Code:
      SELECT * 
      FROM blocktable br1 
      WHERE br1.b_active='A' AND EXISTS ( 
        SELECT * 
        FROM blocktable br2 
        WHERE br2.b_active='D' AND br1.b_account=br2.b_account AND br1.b_item=br2.b_item) 
      FETCH FIRST 5 ROWS ONLY WITH UR
    to make sure the the other half wasn't buried after the first 5 rows, i ordered the columns, but that significantly extends the amount of time they need to run.

    Hope this helps understand my steps so far.

  8. #8
    Join Date
    Jun 2007
    Location
    germany
    Posts
    155
    how about union all with the opposite?
    and then sort the file that you create outside of db2.
    Dick Brenholtz, Ami in Deutschland

  9. #9
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    Code:
    SELECT * 
       FROM blocktable br1 
       WHERE EXISTS ( 
          SELECT * 
             FROM blocktable br2 
             WHERE  br1.b_account = br2.b_account
                AND br1.b_item    = br2.b_item
                AND br1.b_active <> br2.b_active) 
       FETCH FIRST 5 ROWS ONLY WITH UR
    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  10. #10
    Join Date
    Apr 2010
    Posts
    32
    I tried both suggestions, dinosaur's and pat'.

    pat, your query looks sound and looks like it should work, but for some reason it also pulled active records that didn't contain a corresponding inactive record anywhere in the table. i'm not sure why.

    dinosaurs, i union all by switching the statuses and it worked great!

    thanks everyone for you help!!!

  11. #11
    Join Date
    Jun 2007
    Location
    germany
    Posts
    155
    glad you have a solution for a bit.
    gives you time to do a quick and dirty query:
    Select *
    from blocktable
    where b_active NOT IN ('A','D')
    fetch first row only

    gotta have at least one,
    else you have a transfer-typo in your copy of Pat's SQL.
    Dick Brenholtz, Ami in Deutschland

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •