Results 1 to 11 of 11
  1. #1
    Join Date
    Jun 2012
    Posts
    18

    Unanswered: Identify duplicate Records Based On Different Status

    Hi,

    I have the data set as below:

    EMP_NAME MGRNAME STATUS MODIFIED_DATE
    Amy John ACTIVE 01/15/2012 00:00:00
    Amy Ken INSERVICE 06/08/2000 00:00:00
    Amy Tom INACTIVE 04/02/2010 00:00:00
    Ron David ACTIVE 01/15/2008 00:00:00
    Keith Jack INACTIVE 08/10/2005 00:00:00
    Keith Cat INACTIVE 04/30/2008 00:00:00
    Keith Ken INACTIVE 02/04/2010 00:00:00
    Mary Stephen INACTIVE 10/18/2010 00:00:00



    Now, i should identify the duplicate rows based on the below conditions:

    If an Emp has 1 Mgr tagged, then we should not consider the same.
    Ex:- Ron, Mary

    If an Emp has been tagged to multiple Managers, then we need to check if he is tagged to any of the manager as ACTIVE , then we should not consider the same. Fetch those records whose status is <> ACTIVE for that Emp.
    Ex:- For Amy, we should exclude the record with ACTIVE status. We should fetch the records with INSERVICE and INACTIVE

    If an Emp has been tagged to multiple Managers, but he is in INACTIVE status with all of them, then leave the max(MODIFIED_DATE) record and fetch the remaining records
    Ex:- For Keith, as both the Mgr records Status is INACTIVE, fetch the (MODIFIED_DATE) records which are 08/10/2005 00:00:00 and 04/30/2008 00:00:00

    The final output should look like below:


    EMP_NAME MGRNAME STATUS MODIFIED_DATE
    Amy Ken INSERVICE 06/08/2000 00:00:00
    Amy Tom INACTIVE 04/02/2010 00:00:00
    Keith Jack INACTIVE 08/10/2005 00:00:00
    Keith Cat INACTIVE 04/30/2008 00:00:00


    Regards,
    Asha

  2. #2
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    (1) If an Emp is tagged to any of the manager as ACTIVE,
    is ACTIVE record one?
    (e.g. someimes a emp might be managed by an organizaional manager and a project manager in big project)

    (2) If an Emp has been tagged to multiple Managers, but he is in INACTIVE and/or INSERVICE status(but, no ACTIVE),
    What to do?

  3. #3
    Join Date
    Jun 2012
    Posts
    18
    1) If an Emp is tagged to any of the manager as ACTIVE,
    is ACTIVE record one?
    (e.g. someimes a emp might be managed by an organizaional manager and a project manager in big project)

    Yes, the situation may happen where in the Emp can be mapped to multiple managers in ACTIVE status . We should not fetch those ACTIVE records in that case.


    (2) If an Emp has been tagged to multiple Managers, but he is in INACTIVE and/or INSERVICE status(but, no ACTIVE),
    What to do?

    If an Emp is not in ACTIVE status with any manager, then out of remaining status (INACTIVE and INSERVICE) we should not fetch the max(modifieddate) record and remaining records need to be fetched.

  4. #4
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    Please try

    Example 1:
    Code:
    SELECT emp_name
         , mgrname
         , status
         , modified_date
     FROM (SELECT t.*
                , ROW_NUMBER()
                     OVER( PARTITION BY emp_name
                               ORDER BY CASE status
                                        WHEN 'ACTIVE' THEN
                                             status
                                        END           ASC  NULLS LAST
                                      , modified_date DESC
                         ) AS row_num
            FROM  data_set t
          )
     WHERE row_num >  1
       AND status  <> 'ACTIVE'
     ORDER BY
           emp_name      ASC
         , modified_date ASC
    ;
    It would be better to supply more sample data including additional situations to test my example and other trial queries.
    Last edited by tonkuma; 06-30-12 at 21:20. Reason: Add "ASC"s to clarify the meaning of query.

  5. #5
    Join Date
    Jun 2012
    Posts
    18
    Let me check
    Last edited by ashachandrika; 07-02-12 at 16:29.

  6. #6
    Join Date
    Jun 2012
    Posts
    18
    This is 90% near to which i am expecting.

    But we need to check 1 more scenario where in, there is no modified_Date to a particular employee.

    Ex:-
    EMP_NAME MGRNAME STATUS MODIFIED_DATE
    Bin James InService 06/08/2000 00:00:00
    Charles Chris Disconnect


    In this case, we should not fetch those records. But with the above query we are getting those records also.

  7. #7
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    ... But with the above query we are getting those records also.
    Really?
    Did you tried the query on your computer?
    Please publish your results with your test data.

  8. #8
    Join Date
    Jun 2012
    Posts
    18
    sorry for the confusion. Ya its working perfectly fine . Thanks a lot
    tonkuma. But can you explain me, how we have partitioned the same based on the Status. Is it bcz, the ACTIVE status comes first alphabetically or something else.

  9. #9
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    Please see this part
    Code:
    ...
                               ORDER BY CASE status
                                        WHEN 'ACTIVE' THEN
                                             status
                                        END           ASC  NULLS LAST
                                      , modified_date DESC
    ...
    It treated other than status 'ACTIVE' were equal.

    If an Emp is not in ACTIVE status with any manager, then out of remaining status (INACTIVE and INSERVICE) we should not fetch the max(modifieddate) record and remaining records need to be fetched.
    Supporse if modfied_date of an INSERVICE was larger than modfied_date of all INACTIVE.
    Could you "out of remaining status, not fetch the max(modifieddate) record and remaining records to be fetched"?

    And,
    you showed other satuses('InService' and 'Disconnect').
    If those(and more others, if there were) statuses should treat equally as out of 'ACTIVE', it might be safe to use the CASE expression.

    Note: I thought that 'InService' would be not same as 'INSERVICE' on mamy computer systems including Oracle.

  10. #10
    Join Date
    Jun 2012
    Posts
    18
    Supporse if modfied_date of an INSERVICE was larger than modfied_date of all INACTIVE.
    Could you "out of remaining status, not fetch the max(modifieddate) record and remaining records to be fetched"?
    Ex:- If i change my dataset as below:
    EMP_NAME MGRNAME STATUS MODIFIED_DATE
    Keith Jack INACTIVE 08/10/2005 00:00:00
    Keith Cat INSERVICE 04/30/2008 00:00:00
    Keith Ken INACTIVE 02/04/2007 00:00:00

    In this case, i should fetch only the 1st and 3rd record. The query which you have provided satisfies the same condition.


    you showed other satuses('InService' and 'Disconnect').
    If those(and more others, if there were) statuses should treat equally as out of 'ACTIVE', it might be safe to use the CASE expression.
    If there is an emp with ACTIVE status then we can consider all the remaining status as equal. If not, then we will go based on the MODIFIED_DATE approach i.e., ignore the MAX(MODIFIED_DATE) record and fetch the remaining records.


    For Inservice and INSERVICE, anyway i am doing upper(STATUS) so there will not be any change due to the same.


    I have a doubt here. You told that we took all the status except ACTIVE as equal.

    In that case, for the emp : Keith, i understand that all the 3 records should be treated with same row_num = 1 . But its wrong. Can you help me to understand?

  11. #11
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    I couldn't understand your issues.

    Please try your (thinking) query and my example by yourself on your system.
    And publish test data, queries which you tried, the results of the queries, and your expeced result.

    It would be better to add more test data rather than to modify existing data, like...
    EMP_NAME MGRNAME STATUS MODIFIED_DATE
    KeithB Jack INACTIVE 08/10/2005 00:00:00
    KeithB Cat INSERVICE 04/30/2008 00:00:00
    KeithB Ken INACTIVE 02/04/2007 00:00:00
    Last edited by tonkuma; 07-03-12 at 21:54.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •