Results 1 to 14 of 14
  1. #1
    Join Date
    Jul 2008
    Posts
    94

    Unanswered: Flag the data based on duplicate?

    i have duplicate data in table. where i need to flag

    if Col1 is duplicate and Col2 is 20 or 30 then Flag 'N'

    else if Col1 is duplicate and Col2 is 20 or 30 or 40 then Flag 'Y'

    Col1,Col2
    10,20
    10,30
    20,20
    20,30
    20,40

    Expected Output

    Col1,Col2,Col3

    10,20,N
    10,30,N
    20,20,Y
    20,30,Y
    20,40,Y

  2. #2
    Join Date
    Apr 2006
    Location
    Belgium
    Posts
    2,514
    Provided Answers: 11
    have you already tried some sql or waiting for the others to find a solution for you ??
    have a look at group by - having - count...
    IBM DB2 9.5 Information Center for Linux, UNIX, and Windows
    Best Regards, Guy Przytula
    Database Software Consultant
    Good DBAs are not formed in a week or a month. They are created little by little, day by day. Protracted and patient effort is needed to develop good DBAs.
    Spoon feeding : To treat (another) in a way that discourages independent thought or action, as by overindulgence.
    DB2 UDB LUW Certified V7-V8-V9-V9.7-V10.1-V10.5 DB Admin - Advanced DBA -Dprop..
    Information Server Datastage Certified
    http://www.infocura.be

  3. #3
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    if Col1 is duplicate and Col2 is 20 or 30 then Flag 'N'

    else if Col1 is duplicate and Col2 is 20 or 30 or 40 then Flag 'Y'
    Your requirements also duplicated.

    And you didn't wrote what to do
    if NOT(Col1 is duplicate and Col2 is 20 or 30 or 40)
    in other words...
    if (Col1 is not duplicate) or (Col2 is other than 20 or 30 or 40) then do what?
    Last edited by tonkuma; 08-12-11 at 03:55.

  4. #4
    Join Date
    Jul 2008
    Posts
    94
    Col1,Col2
    10,20
    10,30
    20,20
    20,30
    20,40
    01,20
    02,30
    03,40
    03,50

    Expected Output

    Col1,Col2,Col3

    10,20,N
    10,30,N
    20,20,Y
    20,30,Y
    20,40,Y
    01,20,N
    02,30,N
    03,40,Y
    03,50,Y


    Code:
    if Col1 is duplicate and Col2 is 20 or 30 then Flag 'N' 
    else if Not(Col1 is duplicate) and Col2 is 20  then Flag 'N'
    else if Not(Col1 is duplicate) and Col2 is 30  then Flag 'N'
    else Col1 is duplicate and Not(Col2 is 20 or 30) then Flag 'Y'

  5. #5
    Join Date
    Apr 2006
    Location
    Belgium
    Posts
    2,514
    Provided Answers: 11
    have you already tried some sql or waiting for the others to find a solution for you ??
    I myself I don't like "spoonfeeding" requests...
    Best Regards, Guy Przytula
    Database Software Consultant
    Good DBAs are not formed in a week or a month. They are created little by little, day by day. Protracted and patient effort is needed to develop good DBAs.
    Spoon feeding : To treat (another) in a way that discourages independent thought or action, as by overindulgence.
    DB2 UDB LUW Certified V7-V8-V9-V9.7-V10.1-V10.5 DB Admin - Advanced DBA -Dprop..
    Information Server Datastage Certified
    http://www.infocura.be

  6. #6
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    Code:
    if Col1 is duplicate and Col2 is 20 or 30 then Flag 'N' 
    else if Not(Col1 is duplicate) and Col2 is 20  then Flag 'N'
    else if Not(Col1 is duplicate) and Col2 is 30  then Flag 'N'
    else Col1 is duplicate and Not(Col2 is 20 or 30) then Flag 'Y'
    How to flag following cases?
    'Y' or 'N' or other value or null?

    col1, col2, flag
    111, 25, ?
    121, 20, ?
    121, 27, ?
    122, 30, ?
    122, 34, ?
    141, 20, ?
    141, 21, ?
    141, 22, ?
    141, 30, ?

    col1 is not duplicate and col2 is not 20 nor 30.
    col1 is duplicate(2 rows only) and col2 of one row is 20 and col2 of another row is not 20 nor 30.
    col1 is duplicate(2 rows only) and col2 of one row is 30 and col2 of another row is not 20 nor 30.
    Last edited by tonkuma; 08-12-11 at 08:32. Reason: Add rows col1=141

  7. #7
    Join Date
    Jul 2008
    Posts
    94
    col1, col2, flag
    111, 25, Y
    121, 20, Y
    121, 27, Y
    122, 30, Y
    122, 34, Y
    141, 20, Y
    141, 21, Y
    141, 22, Y
    141, 30, Y

    The requirement here is we have to check only for 20 or 30 and col1 is either duplicate or unique then we have to flag as 'N'.

    142,20,N
    143,30,N
    144,20,N
    144,30,N
    145,20,Y
    145,30,Y
    145,40,Y
    146,null,Y
    null,null,Y
    Last edited by laknar; 08-12-11 at 10:51.

  8. #8
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    The requirement here is we have to check only for 20 or 30 and col1 is either duplicate or unique then we have to flag as 'N'.
    I thought this was not enough, because the following case was unclear.
    (Note: it is not easy to know rows with a condition are not exists.)
    if col2 of a row is 20 or 30 and col1 is duplicate and there is another row with same col1 and col2 is not 20 nor 30.

    My understandings of your requirements from examples are...
    Code:
    Conditions of flag 'N'
    1) col1 is not duplicate and col2 is 20 or 30
    2) col1 is duplicate two rows and one of col2 is 20 and another col2 is 30
    in other words:
     all col2 with same col1 are 20 or 30. 
    
    All other rows flag 'Y'

  9. #9
    Join Date
    Jul 2008
    Posts
    94
    yes, exactly.

    Conditions of flag 'N'
    1) col1 is not duplicate and col2 is 20 or 30
    2) col1 is duplicate two rows and one of col2 is 20 and another col2 is 30
    in other words:
    all col2 with same col1 are 20 or 30.

    All other rows flag 'Y'
    i have tried and still working on several queries with Analytical functions like (Row_Number()) and Aggregate functions.


    any help is greatly appreciated.

    thanks.

  10. #10
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    One way may be...
    Code:
    Conditions of flag 'N'
    1) col1 is not duplicate and col2 is 20 or 30
        COUNT(*) OVER(PARTITION BY col1) = 1
    AND col2 IN (20 , 30)
    
    2) col1 is duplicate two rows and one of col2 is 20 and another col2 is 30
        COUNT(*) OVER(PARTITION BY col1) = 2
    AND MIN(col2) OVER(...) ...
    AND MAX(col2) ... ...

  11. #11
    Join Date
    Jul 2008
    Posts
    94
    Thank you tonkuma for you hint.

    Code:
    create table testt(col1 number,col2 varchar(5));
    
    insert into testt values(10,'20');
    insert into testt values(10,'30');
    insert into testt values(20,'20');
    insert into testt values(30,'30');
    insert into testt values(40,'20');
    insert into testt values(40,'30');
    insert into testt values(40,'40');
    insert into testt values(50,'40');
    
    
    select col1,col2,case when (count(*) Over(partition by col1))=1 and (col2='20' or col2='30') Then 'N' 
    when (count(*) Over(partition by col1))=2 and ((Min(col2) Over(partition by col1))='20' or (Max(col2) Over(partition by col1))='30') Then 'N'
    else 'Y' end as FLAG  from testt
    I have developed this query. is this correct?

  12. #12
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    I think
    ... ((Min(col2) Over(partition by col1))='20' AND (Max(col2) Over(partition by col1))='30') ...

    For example:
    121, '20', Y
    121, '27', Y

  13. #13
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    Example of some other variations:
    Code:
         , CASE
           WHEN COUNT(*)  OVER(PARTITION BY col1) <= 2
            AND MIN(col2) OVER(PARTITION BY col1) IN ('20' , '30')
            AND MAX(col2) OVER(PARTITION BY col1) IN ('20' , '30')
           THEN 'N'
           ELSE 'Y'
           END  AS flag1
    
         , CASE
           WHEN VARCHAR( COUNT(*) OVER(PARTITION BY col1) )
             || MIN(col2) OVER(PARTITION BY col1)
             || MAX(col2) OVER(PARTITION BY col1)
                IN ('12020' , '13030' , '22030')
           THEN 'N'
           ELSE 'Y'
           END  AS flag2
    
         , CASE
           WHEN SUM( CASE
                     WHEN col2 IN ('20' , '30')
                     THEN 0
                     ELSE 1
                     END
                   ) OVER(PARTITION BY col1) = 0
           THEN 'N'
           ELSE 'Y'
           END  AS flag3
    
         , CASE
           WHEN '20:30:'
                LIKE '%' || VARCHAR( LISTAGG( col2 || ':%' )
                                     OVER( PARTITION BY col1 ORDER BY col2
                                           ROWS BETWEEN UNBOUNDED PRECEDING
                                                    AND UNBOUNDED FOLLOWING )
                            , 3999 )
           THEN 'N'
           ELSE 'Y'
           END  AS flag4

  14. #14
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    More examples:
    Code:
         , SUBSTR(
              'NY'
            , SIGN( SUM( CASE
                         WHEN col2 IN ('20' , '30')
                         THEN 0
                         ELSE 1
                         END
                       ) OVER(PARTITION BY col1)
                  ) + 1
            , 1
           )  AS flag5
    
         , CHR( 78 + SIGN( SUM( COALESCE( 1 - LOCATE('20' , col2)
                                            - LOCATE('30' , col2)
                                        , 1
                                        )
                              ) OVER(PARTITION BY col1)
                         ) * 11
              ) AS flag6
    
         , CHR( 78 + SIGN(  COUNT(*) OVER(PARTITION BY col1)
                          - COUNT( NULLIF(  LOCATE('20' , col2)
                                          + LOCATE('30' , col2)
                                          , 0
                                         )
                                 ) OVER(PARTITION BY col1)
                         ) * 11
              ) AS flag7

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •