Results 1 to 12 of 12
  1. #1
    Join Date
    Jul 2008
    Posts
    94

    Unanswered: row_number() to generate unique key for each group.

    Hi,
    I need a help to generate a sequence number based on other key attributes.
    i tried writing the query using row_number(), lag(), lead() and rank() functions.
    Any suggestions would be great.

    Code:
    Input:-
    -------
    Key, Key1, Key2
    10, XX, 101
    10, YY, 102
    10, ZZ, 103
    10, XX, 104
    10, JJ, 101
    11, AA, 102
    11, AA, 108
    11, BB, 109
    12, CC, 110
    
    Output:-
    --------
    Key, Key_Seqno, Key1, Key1_seqno, Key2, Key2_seqno
    10,    1               XX,     2                    101,  3  
    10,    1               YY,     4                    102,  5
    10,    1               ZZ,     6                    103,  7
    10,    1               XX,     2                    104,  8
    10,    1               JJ,     9                    101,  3
    11,    1               AA,     2                    102,  3
    11,    1               AA,     2                    108,  4
    11,    1               BB,     5                    109,  6
    12,    1               CC,     2                    110,  3

  2. #2
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    Can you describe what you want to accomplish in English as opposed to SQL?

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  3. #3
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    Sequence of rows are not guaranteed.
    So, you should add a sequence column, like
    Code:
    SEQ         KEY         KEY1 KEY2       
    ----------- ----------- ---- -----------
              1          10 XX           101
              2          10 YY           102
              3          10 ZZ           103
              4          10 XX           104
              5          10 JJ           101
              6          11 AA           102
              7          11 AA           108
              8          11 BB           109
              9          12 CC           110
    For example:
    Without such column, how to know the row
    10, JJ, 101
    was 5th row.

  4. #4
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    Though the query example is very complex,
    it returned your expected ouput anyway.

    Test input data:
    Code:
    ------------------------------ Commands Entered ------------------------------
    WITH
     sample_input
    ( seq , Key , Key1 , Key2 ) AS (
    VALUES
      ( 1 , 10 , 'XX' , 101 )
    , ( 2 , 10 , 'YY' , 102 )
    , ( 3 , 10 , 'ZZ' , 103 )
    , ( 4 , 10 , 'XX' , 104 )
    , ( 5 , 10 , 'JJ' , 101 )
    , ( 6 , 11 , 'AA' , 102 )
    , ( 7 , 11 , 'AA' , 108 )
    , ( 8 , 11 , 'BB' , 109 )
    , ( 9 , 12 , 'CC' , 110 )
    )
    Example 1:
    Code:
    SELECT key
         , 1   AS key_seqno
         , key1
         , INT( MIN(key1_seqno) OVER(PARTITION BY key , key1) ) AS key1_seqno
         , key2
         , INT( MIN(key2_seqno) OVER(PARTITION BY key , key2) ) AS key2_seqno
     FROM  (SELECT r.*
                 , dr1
                   + COALESCE(
                        MAX(dr2) OVER(PARTITION BY key
                                          ORDER BY seq
                                      ROWS BETWEEN UNBOUNDED PRECEDING
                                               AND 1         PRECEDING)
                      , 0)
                   + 1 AS key1_seqno
                 , dr2
                   +    MAX(dr1) OVER(PARTITION BY key
                                          ORDER BY seq
                                      ROWS BETWEEN UNBOUNDED PRECEDING
                                               AND CURRENT ROW)
                   + 1 AS key2_seqno
             FROM  (SELECT s.*
                         , DENSE_RANK()
                              OVER(PARTITION BY key
                                       ORDER BY MIN(seq)
                                                   OVER(PARTITION BY key , key1)
                                  ) AS dr1
                         , DENSE_RANK()
                              OVER(PARTITION BY key
                                       ORDER BY MIN(seq)
                                                   OVER(PARTITION BY key , key2)
                                  ) AS dr2
                     FROM  sample_input AS s
                   ) AS r
           ) AS q
     ORDER BY
           seq
    ;
    Ouput:
    Code:
    ------------------------------------------------------------------------------
    
    KEY         KEY_SEQNO   KEY1 KEY1_SEQNO  KEY2        KEY2_SEQNO 
    ----------- ----------- ---- ----------- ----------- -----------
             10           1 XX             2         101           3
             10           1 YY             4         102           5
             10           1 ZZ             6         103           7
             10           1 XX             2         104           8
             10           1 JJ             9         101           3
             11           1 AA             2         102           3
             11           1 AA             2         108           4
             11           1 BB             5         109           6
             12           1 CC             2         110           3
    
      9 record(s) selected.

  5. #5
    Join Date
    Jul 2008
    Posts
    94
    Thank you so much for your quick reply.

    I understand that the data is not in order. we don't have seq column in source data but by sorting the source data by key,key1,key2 we can generate this unique column. sorry this was my mistake while i entered the data.
    let me explain the requirement in english.
    within the organization there are multiple departments and under each dept there are several managers and under each managers there are multiple employees. over the period of time, employee might change to different managers. Here i haven't mentioned the period columns(start_date and end_date).

    for each department, i need to generate the seq_no starting from dept, manage, employee and so on..

    so in general, i have to generate the seq_no for only unique key column values otherwise retain the same seq_no.

    Here i have corrected the source data.

    Code:
    Input:-
    -------
    Key, Key1, Key2
    10, XX, 101
    10, YY, 102
    10, ZZ, 103
    10, XX, 104
    10, JJ, 101
    11, AA, 102
    11, AA, 108
    11, BB, 109
    12, CC, 110
    
    Output:-
    --------
    Key, Key_Seqno, Key1, Key1_seqno, Key2, Key2_seqno
    10,    1               JJ,     2                    101,  3
    10,    1               XX,     4                    101,  3  
    10,    1               XX,     4                    104,  5
    10,    1               YY,     6                    102,  7
    10,    1               ZZ,     8                    103,  9
    11,    1               AA,     2                    102,  3
    11,    1               AA,     2                    108,  4
    11,    1               BB,     5                    109,  6
    12,    1               CC,     2                    110,  3
    Thanks again for the query. Appreciate your help. i believe i can customize the query you provided to achieve this result.
    if you found any better solution. please let me know.

  6. #6
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    Quote Originally Posted by laknar View Post
    ...
    ...

    Here i have corrected the source data.

    Code:
    Input:-
    -------
    Key, Key1, Key2
    10, XX, 101
    10, YY, 102
    10, ZZ, 103
    10, XX, 104
    10, JJ, 101
    11, AA, 102
    11, AA, 108
    11, BB, 109
    12, CC, 110
    
    Output:-
    --------
    Key, Key_Seqno, Key1, Key1_seqno, Key2, Key2_seqno
    10,    1               JJ,     2                    101,  3
    10,    1               XX,     4                    101,  3  
    10,    1               XX,     4                    104,  5
    10,    1               YY,     6                    102,  7
    10,    1               ZZ,     8                    103,  9
    11,    1               AA,     2                    102,  3
    11,    1               AA,     2                    108,  4
    11,    1               BB,     5                    109,  6
    12,    1               CC,     2                    110,  3
    For the output, replace
    Code:
                     FROM  sample_input AS s
    by
    Code:
                     FROM  (SELECT t.*
                                 , ROW_NUMBER()
                                      OVER(ORDER BY key , key1 , key2) AS seq
                             FROM  sample_input AS t
                           ) AS s

    Followings are the test result.

    Test input data:
    Code:
    ------------------------------ Commands Entered ------------------------------
    WITH
     sample_input
    ( Key , Key1 , Key2 ) AS (
    VALUES
      ( 10 , 'XX' , 101 )
    , ( 10 , 'YY' , 102 )
    , ( 10 , 'ZZ' , 103 )
    , ( 10 , 'XX' , 104 )
    , ( 10 , 'JJ' , 101 )
    , ( 11 , 'AA' , 102 )
    , ( 11 , 'AA' , 108 )
    , ( 11 , 'BB' , 109 )
    , ( 12 , 'CC' , 110 )
    )
    Example 2:
    Code:
    SELECT key
         , 1    AS key_seqno
         , key1
         , INT( MIN(key1_seqno) OVER(PARTITION BY key , key1) ) AS key1_seqno
         , key2
         , INT( MIN(key2_seqno) OVER(PARTITION BY key , key2) ) AS key2_seqno
     FROM  (SELECT r.*
                 , dr1
                   + COALESCE(
                        MAX(dr2) OVER(PARTITION BY key
                                          ORDER BY seq
                                      ROWS BETWEEN UNBOUNDED PRECEDING
                                               AND 1         PRECEDING)
                      , 0)
                   + 1 AS key1_seqno
                 , dr2
                   +    MAX(dr1) OVER(PARTITION BY key
                                          ORDER BY seq
                                      ROWS BETWEEN UNBOUNDED PRECEDING
                                               AND CURRENT ROW)
                   + 1 AS key2_seqno
             FROM  (SELECT s.*
                         , DENSE_RANK()
                              OVER(PARTITION BY key
                                       ORDER BY MIN(seq)
                                                   OVER(PARTITION BY key , key1)
                                  ) AS dr1
                         , DENSE_RANK()
                              OVER(PARTITION BY key
                                       ORDER BY MIN(seq)
                                                   OVER(PARTITION BY key , key2)
                                  ) AS dr2
                     FROM  (SELECT t.*
                                 , ROW_NUMBER()
                                      OVER(ORDER BY key , key1 , key2) AS seq
                             FROM  sample_input AS t
                           ) AS s
                   ) AS r
           ) AS q
     ORDER BY
           seq
    ;
    Output:
    Code:
    ------------------------------------------------------------------------------
    
    KEY         KEY_SEQNO   KEY1 KEY1_SEQNO  KEY2        KEY2_SEQNO 
    ----------- ----------- ---- ----------- ----------- -----------
             10           1 JJ             2         101           3
             10           1 XX             4         101           3
             10           1 XX             4         104           5
             10           1 YY             6         102           7
             10           1 ZZ             8         103           9
             11           1 AA             2         102           3
             11           1 AA             2         108           4
             11           1 BB             5         109           6
             12           1 CC             2         110           3
    
      9 record(s) selected.

  7. #7
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    A revision of Example 2.

    Example 2a:
    Code:
    SELECT key
         , 1   AS key_seqno
         , key1
         , MIN( INT(key1_seqno) ) OVER(PARTITION BY key , key1) AS key1_seqno
         , key2
         , MIN( INT(key2_seqno) ) OVER(PARTITION BY key , key2) AS key2_seqno
     FROM  (SELECT r.*
                 , dr1
                   + COALESCE(
                        MAX(dr2) OVER(PARTITION BY key
                                          ORDER BY key1 , key2
                                      ROWS BETWEEN UNBOUNDED PRECEDING
                                               AND 1         PRECEDING)
                      , 0)
                   + 1 AS key1_seqno
                 , dr2
                   +    MAX(dr1) OVER(PARTITION BY key
                                          ORDER BY key1
                                      ROWS BETWEEN UNBOUNDED PRECEDING
                                               AND CURRENT ROW)
                   + 1 AS key2_seqno
             FROM  (SELECT s.*
                         , DENSE_RANK()
                              OVER(PARTITION BY key
                                       ORDER BY key1
                                  ) AS dr1
                         , DENSE_RANK()
                              OVER(PARTITION BY key
                                       ORDER BY MIN(seq)
                                                   OVER(PARTITION BY key , key2)
                                  ) AS dr2
                     FROM  (SELECT t.*
                                 , ROW_NUMBER()
                                      OVER(ORDER BY key , key1 , key2) AS seq
                             FROM  sample_input AS t
                           ) AS s
                   ) AS r
           ) AS q
     ORDER BY
           seq
    ;

  8. #8
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    Reduced nesting level of subqueries,
    by using more nesting of OLAP specifications.

    Edit: Simplify the calculation of key2_seqno and key1_seqno.
    Example 2b:
    Code:
    SELECT key
         , 1    AS key_seqno
         , key1
         , MIN(INT(dr1
                   + COALESCE(
                        MAX(dr2) OVER(PARTITION BY key
                                          ORDER BY key1 /* , key2 */
                                      ROWS BETWEEN UNBOUNDED PRECEDING
                                               AND 1         PRECEDING)
                      , 0)
                   + 1
                  )
              ) OVER(PARTITION BY key , key1) AS key1_seqno
         , key2
    /*
         , MIN(INT(dr2
                   +    MAX(dr1) OVER(PARTITION BY key
                                          ORDER BY key1
                                      ROWS BETWEEN UNBOUNDED PRECEDING
                                               AND CURRENT ROW        )
                   + 1
                  )
              )
    */
         , MIN( INT(dr2 + dr1 + 1) )
              OVER(PARTITION BY key , key2) AS key2_seqno
     FROM  (SELECT s.*
                 , DENSE_RANK()
                      OVER(PARTITION BY key
                               ORDER BY key1
                          ) AS dr1
                 , DENSE_RANK()
                      OVER(PARTITION BY key
                               ORDER BY
                                     MIN(ROW_NUMBER()
                                            OVER(ORDER BY key , key1 , key2)
                                        ) OVER(PARTITION BY key , key2)
                          ) AS dr2
             FROM  sample_input AS s
           ) AS r
     ORDER BY
           key , key1 , key2
    ;
    Last edited by tonkuma; 09-03-14 at 09:39. Reason: Simplify calcuration of key2_seqno and key1_seqno

  9. #9
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    Another way to calculate dr2.
    Reduced nesting level of OLAP specifications(3 to 2).

    Amended source code(Marked Bold/Red).
    Example 3:
    Code:
    SELECT key
         , 1    AS key_seqno
         , key1
         , MIN( INT(1 + dr1
                    + COALESCE(
                         MAX(dr2)
                            OVER(PARTITION BY key
                                     ORDER BY key1
                                 ROWS BETWEEN UNBOUNDED PRECEDING
                                          AND 1 PRECEDING        )
                       , 0)
                   )
              )
              OVER(PARTITION BY key , key1) AS key1_seqno
         , key2
         , MIN( INT(1 + dr2 + dr1) )
              OVER(PARTITION BY key , key2) AS key2_seqno
     FROM  (SELECT s.*
                 , DENSE_RANK()
                      OVER(PARTITION BY key
                               ORDER BY key1
                          ) AS dr1
                 , COUNT(CASE
                         WHEN ROW_NUMBER()
                                 OVER(PARTITION BY key , key2)
                              = 1
                         THEN 0    /* subject to count   */
                         ELSE NULL /* exclude from count */
                         END )
                      OVER(PARTITION BY key
                               ORDER BY key1 , key2
                           ROWS BETWEEN UNBOUNDED PRECEDING
                                    AND CURRENT ROW        )
                   /* + 1 */ AS dr2
             FROM  sample_input AS s
           ) AS r
     ORDER BY
           key , key1 , key2
    ;
    Last edited by tonkuma; 09-08-14 at 21:32. Reason: Amended source code(Marked Bold/Red).

  10. #10
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    Reduced number of OLAP specifications(6 to 5).

    Example 4:
    Code:
    SELECT key
         , 1    AS key_seqno
         , key1
         , MIN(dr1 + dr2 - key2_first)
              OVER(PARTITION BY key , key1) AS key1_seqno
         , key2
         , MIN(dr1 + dr2)
              OVER(PARTITION BY key , key2) AS key2_seqno
     FROM  (SELECT s.*
                 , INT(
                   DENSE_RANK()
                      OVER(PARTITION BY key
                               ORDER BY key1)
                   ) + 1 AS dr1
                 , SUM(key2_first)
                      OVER(PARTITION BY key
                               ORDER BY key1 , key2)
                   AS dr2
             FROM  (SELECT t.*
                         , CASE
                           WHEN ROW_NUMBER()
                                   OVER(PARTITION BY key , key2
                                            ORDER BY key1      )
                                = 1
                           THEN 1
                           ELSE 0
                           END  AS key2_first
                     FROM  sample_input AS t
                   ) AS s
           ) AS r
     ORDER BY
           key , key1 , key2
    ;
    Last edited by tonkuma; 09-07-14 at 12:56. Reason: Removed "ROWS ..." from expression for dr2.

  11. #11
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    Rather procedural way.
    Though it may be longer than set oriened ways(e.g. Example 4),
    it might be easier to follow/understand the logic.

    Example 5:
    Code:
    WITH
     with_rn AS (
    SELECT t.*
         , ROW_NUMBER()
              OVER(PARTITION BY key
                       ORDER BY key1 , key2) AS rn
     FROM  sample_input AS t
    )
    , rcte
    ( key , key1 , key2 , rn , seqno
    , key1_seqno , key2_seqno
    , key2_numbered_list
    ) AS (
    SELECT key , '  ' , 0 , 1 , 1
         , 0 , 0
         , VARCHAR('', 4000)
     FROM  with_rn
     WHERE rn = 1
    UNION ALL
    SELECT key , key1 , key2
         , rn + 1
         , MAX(seqno , key1_seqno , key2_seqno) AS seqno
         , key1_seqno , key2_seqno
         , key2_numbered_list
           || CASE
              WHEN key2_seqno > seqno THEN
                   DIGITS(key2) || ':' || DIGITS(key2_seqno)
              ELSE ''
              END  /* (new) key2_numbered_list */
     FROM  (SELECT key , key1 , key2
                 , rn , seqno , key1_seqno
                 , COALESCE(
                      INT(
                         SUBSTR(
                            key2_numbered_list
                          , NULLIF(
                               LOCATE(DIGITS(key2) || ':' , key2_numbered_list)
                             , 0
                            ) + 11
                          , 10
                         )
                      )
                    , MAX(seqno , key1_seqno) + 1
                   ) AS key2_seqno
                 , key2_numbered_list
             FROM  (SELECT r.key , s.key1 , s.key2
                         , r.rn , r.seqno
                         , CASE
                           WHEN s.key1 > r.key1 THEN
                                seqno + 1
                           ELSE key1_seqno
                           END
                             AS key1_seqno
                         , key2_numbered_list
                     FROM  rcte    AS r
                         , with_rn AS s
                     WHERE s.key = r.key
                      AND  s.rn  = r.rn
                      AND  r.rn  < 1000
                   )
           )
    )
    SELECT key
         , key1 , key1_seqno
         , key2 , key2_seqno
     FROM  rcte
     WHERE rn > 1
     ORDER BY
           key , rn
    ;

  12. #12
    Join Date
    Jul 2008
    Posts
    94
    Thanks a ton. My organization maintains 2 employee database but both are identical in structure. However, each database have different employee details. but the problem is the other database is in oracle and it fails in oracle(even after replacing the oracle compatible functions). i'm trying to customize the queries in oracle to achieve the same.

    Thanks again for your help.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •