Results 1 to 11 of 11
  1. #1
    Join Date
    Apr 2013
    Posts
    2

    Unanswered: Calculating maximum consecutive days.

    Hi Guys,
    I am new to the forum and am learning SQL and would appreciate any help!! I am on DB2 using QMF SQL. I have a four month data set, each ID will have a 1 or 0 assigned to each date in the four month date span. I need to find the maximum consecutive days where ABSENT = 1. In the below sample set, the answer for ID = X1234 would by be 4 (assuming the rest of the dates are zero). I am stuck and cannot find the right programming, again any help will be greatly appreciated!

    DATE ID ABSENT
    2012-01-01 X1234 0
    2012-01-02 X1234 1
    2012-01-03 X1234 0
    2012-01-04 X1234 1
    2012-01-05 X1234 1
    2012-01-06 X1234 1
    2012-01-07 X1234 0
    2012-01-08 X1234 0
    2012-01-09 X1234 1
    2012-01-10 X1234 1
    2012-01-11 X1234 1
    2012-01-12 X1234 1

  2. #2
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    What DB2 version/release and platform OS are you using?

    Here is an example tested on DB2 9.7 for Windows.
    It might not work on z/OS.

    Example 1:
    Code:
    WITH
     sample_set
    ( date_ , id , absent ) AS (
    VALUES
      ( '2012-01-01' , 'X1234' , 0 )
    , ( '2012-01-02' , 'X1234' , 1 )
    , ( '2012-01-03' , 'X1234' , 0 )
    , ( '2012-01-04' , 'X1234' , 1 )
    , ( '2012-01-05' , 'X1234' , 1 )
    , ( '2012-01-06' , 'X1234' , 1 )
    , ( '2012-01-07' , 'X1234' , 0 )
    , ( '2012-01-08' , 'X1234' , 0 )
    , ( '2012-01-09' , 'X1234' , 1 )
    , ( '2012-01-10' , 'X1234' , 1 )
    , ( '2012-01-11' , 'X1234' , 1 )
    , ( '2012-01-12' , 'X1234' , 1 )
    )
    SELECT id
         , MAX(consecutive_days) AS max_consecutive_days
     FROM  (SELECT id
                 , COUNT(*) AS consecutive_days
             FROM  (SELECT id
                         , DAYS(date_)
                           -
                           ROW_NUMBER()
                              OVER(PARTITION BY id
                                       ORDER BY date_
                                  )
                           AS date_group
                     FROM  sample_set
                     WHERE absent = 1
                   ) q
             GROUP BY
                   id
                 , date_group
          ) r
     GROUP BY
           id
    ;
    ------------------------------------------------------------------------------
    
    ID    MAX_CONSECUTIVE_DAYS
    ----- --------------------
    X1234                    4
    
      1 record(s) selected.

  3. #3
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    If date were consecutive within each id,
    Example 2 might be a solution.

    Your sample looks "date were consecutive".
    But, an issue may be how to guarantee the consecutiveness in more bigger data.

    Example 2: tested on DB2 9.7 for Windows
    Code:
    ------------------------------ Commands Entered ------------------------------
    WITH
     sample_set
    ( date_ , id , absent ) AS (
    VALUES
      ( '2012-01-01' , 'X1234' , 0 )
    , ( '2012-01-02' , 'X1234' , 1 )
    , ( '2012-01-03' , 'X1234' , 0 )
    , ( '2012-01-04' , 'X1234' , 1 )
    , ( '2012-01-05' , 'X1234' , 1 )
    , ( '2012-01-06' , 'X1234' , 1 )
    , ( '2012-01-07' , 'X1234' , 0 )
    , ( '2012-01-08' , 'X1234' , 0 )
    , ( '2012-01-09' , 'X1234' , 1 )
    , ( '2012-01-10' , 'X1234' , 1 )
    , ( '2012-01-11' , 'X1234' , 1 )
    , ( '2012-01-12' , 'X1234' , 1 )
    )
    SELECT id
         , MAX(consecutive_days) AS max_consecutive_days
     FROM  (SELECT id
                 , COALESCE(
                      MIN( CASE absent WHEN 0 THEN DAYS(date_) END )
                         OVER(PARTITION BY id
                                  ORDER BY date_
                              ROWS BETWEEN CURRENT ROW
                                       AND UNBOUNDED FOLLOWING
                             )
                    , MAX( DAYS(date_) )
                         OVER(PARTITION BY id
                             ) + 1
                   )
                   - DAYS(date_)
                   AS consecutive_days
             FROM  sample_set
           ) r
     GROUP BY
           id
    ;
    ------------------------------------------------------------------------------
    
    ID    MAX_CONSECUTIVE_DAYS
    ----- --------------------
    X1234                    4
    
      1 record(s) selected.

  4. #4
    Join Date
    May 2003
    Location
    USA
    Posts
    5,737
    If he is using QMF, he is probably using DB2 z/OS, but obviously need to know what version is being used.
    M. A. Feldman
    IBM Certified DBA on DB2 for Linux, UNIX, and Windows
    IBM Certified DBA on DB2 for z/OS and OS/390

  5. #5
    Join Date
    Jan 2013
    Posts
    354
    Provided Answers: 1

    Looking for clusters of the same value.

    You do not need the student id for this example, so let's drop it.

    CREATE TABLE Roster
    (attendance_date DATE NOT NULL PRIMARY KEY,
    attendance_flg INTEGER NOT NULL);

    INSERT INTO Roster
    VALUES
    ('2012-01-01', 0),
    ('2012-01-02', 1),
    ('2012-01-03', 0),
    ('2012-01-04', 1),
    ('2012-01-05', 1),
    ('2012-01-06', 1),
    ('2012-01-07', 0),
    ('2012-01-08', 0),
    ('2012-01-09', 1),
    ('2012-01-10', 1),
    ('2012-01-11', 1),
    ('2012-01-12', 1);

    This query will give you the start date, end date and size of the clusters of absences. This is worth having by itself, but now you can use it to get the longest cluster.

    SELECT MIN(X.attendance_date) AS cluster_start,
    MAX(X.attendance_date) AS cluster_end,
    SUM(attendance_flg) AS absence_cnt
    FROM (SELECT attendance_date, attendance_flg,
    (ROW_NUMBER () OVER (ORDER BY attendance_date)
    - ROW_NUMBER()
    OVER (PARTITION BY attendance_flg
    ORDER BY attendance_date))
    FROM Roster)
    AS X(attendance_date, attendance_flg, absence_cnt)
    GROUP BY absence_cnt
    HAVING MAX (attendance_flg) = 1;

    ==========================
    2012-01-02 2012-01-03 1
    2012-01-04 2012-01-06 3
    2012-01-07 2012-01-12 4

  6. #6
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    Another example to calculate the count of consecutive days.

    Example 3s: A query to calculate each of consecutive days.
    Code:
    ------------------------------ Commands Entered ------------------------------
    WITH
     sample_set
    ( date_ , id , absent ) AS (
    VALUES
      ( '2012-01-01' , 'X1234' , 0 )
    , ( '2012-01-02' , 'X1234' , 1 )
    , ( '2012-01-03' , 'X1234' , 0 )
    , ( '2012-01-04' , 'X1234' , 1 )
    , ( '2012-01-05' , 'X1234' , 1 )
    , ( '2012-01-06' , 'X1234' , 1 )
    , ( '2012-01-07' , 'X1234' , 0 )
    , ( '2012-01-08' , 'X1234' , 0 )
    , ( '2012-01-09' , 'X1234' , 1 )
    , ( '2012-01-10' , 'X1234' , 1 )
    , ( '2012-01-11' , 'X1234' , 1 )
    , ( '2012-01-12' , 'X1234' , 1 )
    )
    SELECT id
         , MIN(date_) AS start_date
         , MAX(date_) AS end_date
         , DAYS(MAX(date_)) - DAYS(MIN(date_)) + 1 AS consecutive_days
     FROM  (
            SELECT id
                 , date_
                 , ROW_NUMBER()
                      OVER(ORDER BY lag_absent , date_
                          ) AS group_nbr_by_start_date
                 , ROW_NUMBER()
                      OVER(ORDER BY leadabsent , date_
                          ) AS group_nbr_by_end_date
             FROM  (SELECT id
                         , date_
                         , absent
                         , LAG (absent , 1 , 0)
                              OVER(ORDER BY date_) AS lag_absent
                         , LEAD(absent , 1 , 0)
                              OVER(ORDER BY date_) AS leadabsent
                     FROM  sample_set
                   ) q
             WHERE absent = 1
               AND
               (   lag_absent = 0
                OR leadabsent = 0
               )
           ) r
     GROUP BY
           id
         , MIN(group_nbr_by_start_date , group_nbr_by_end_date)
    ;
    ------------------------------------------------------------------------------
    
    ID    START_DATE END_DATE   CONSECUTIVE_DAYS
    ----- ---------- ---------- ----------------
    X1234 2012-01-02 2012-01-02                1
    X1234 2012-01-04 2012-01-06                3
    X1234 2012-01-09 2012-01-12                4
    
      3 record(s) selected.

    Get maximum consecutive days for each id, using Example 3s, like...
    Code:
    SELECT id
         , MAX(consecutive_days) AS max_consecutive_days
     FROM  (/*
            Example 3s
            */
           )
     GROUP BY
           id
    ;
    Example 3:
    Code:
    ------------------------------ Commands Entered ------------------------------
    WITH
     sample_set
    ( date_ , id , absent ) AS (
    VALUES
      ( '2012-01-01' , 'X1234' , 0 )
    , ( '2012-01-02' , 'X1234' , 1 )
    , ( '2012-01-03' , 'X1234' , 0 )
    , ( '2012-01-04' , 'X1234' , 1 )
    , ( '2012-01-05' , 'X1234' , 1 )
    , ( '2012-01-06' , 'X1234' , 1 )
    , ( '2012-01-07' , 'X1234' , 0 )
    , ( '2012-01-08' , 'X1234' , 0 )
    , ( '2012-01-09' , 'X1234' , 1 )
    , ( '2012-01-10' , 'X1234' , 1 )
    , ( '2012-01-11' , 'X1234' , 1 )
    , ( '2012-01-12' , 'X1234' , 1 )
    )
    SELECT id
         , MAX(consecutive_days) AS max_consecutive_days
     FROM  (
            SELECT id
                 , MIN(date_) AS start_date
                 , MAX(date_) AS end_date
                 , DAYS(MAX(date_)) - DAYS(MIN(date_)) + 1 AS consecutive_days
             FROM  (
                    SELECT id
                         , date_
                         , ROW_NUMBER()
                              OVER(ORDER BY lag_absent , date_
                                  ) AS group_nbr_by_start_date
                         , ROW_NUMBER()
                              OVER(ORDER BY leadabsent , date_
                                  ) AS group_nbr_by_end_date
                     FROM  (SELECT id
                                 , date_
                                 , absent
                                 , LAG (absent , 1 , 0)
                                      OVER(ORDER BY date_) AS lag_absent
                                 , LEAD(absent , 1 , 0)
                                      OVER(ORDER BY date_) AS leadabsent
                             FROM  sample_set
                           ) q
                     WHERE absent = 1
                       AND
                       (   lag_absent = 0
                        OR leadabsent = 0
                       )
                   ) r
             GROUP BY
                   id
                 , MIN(group_nbr_by_start_date , group_nbr_by_end_date)
           ) s
     GROUP BY
           id
    ;
    ------------------------------------------------------------------------------
    
    ID    MAX_CONSECUTIVE_DAYS
    ----- --------------------
    X1234                    4
    
      1 record(s) selected.
    Last edited by tonkuma; 04-07-13 at 20:40.

  7. #7
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    Quote Originally Posted by Marcus_A View Post
    If he is using QMF, he is probably using DB2 z/OS, but obviously need to know what version is being used.
    I agree with you.

    Frankly speaking,
    I intentionally published examples of queries which would not work on DB2 for z/OS.
    Because, OP didn't answered to my questions, like...
    What DB2 version/release and platform OS are you using?

    Here is an example tested on DB2 9.7 for Windows.
    It might not work on z/OS.

    Example 1:
    ...
    Last edited by tonkuma; 04-07-13 at 21:01.

  8. #8
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    According to manuals (and Information Center),
    DB2 Version 8 for z/OS doesn't support OLAP specifications.
    DB2 Version 9.1 for z/OS supports RANK, DENSE_RANK, ROW_NUMBER OLAP specifications.
    DB2 10 for z/OS supports aggregate-functions and [ROWS | RANGE window-aggregation-group-clause] with OLAP specifications.

    But, DB2 for z/OS doesn't support LAG, LEAD OLAP specifications, now.


    DB2 9 - DB2 SQL - OLAP specification
    DB2 10 - DB2 SQL - OLAP specification

  9. #9
    Join Date
    Nov 2011
    Posts
    334
    you can try this:
    Code:
    WITH
     sample_set
    ( date_ , id , absent ) AS (
    VALUES
      ( '2012-01-01' , 'X1234' , 0 )
    , ( '2012-01-02' , 'X1234' , 1 )
    , ( '2012-01-03' , 'X1234' , 0 )
    , ( '2012-01-04' , 'X1234' , 1 )
    , ( '2012-01-05' , 'X1234' , 1 )
    , ( '2012-01-06' , 'X1234' , 1 )
    , ( '2012-01-07' , 'X1234' , 0 )
    , ( '2012-01-08' , 'X1234' , 0 )
    , ( '2012-01-09' , 'X1234' , 1 )
    , ( '2012-01-10' , 'X1234' , 1 )
    , ( '2012-01-11' , 'X1234' , 1 )
    , ( '2012-01-12' , 'X1234' , 1 )
    )
    SELECT   id,
             max(diff)
    FROM     ( SELECT   id,
             days(date_) - 
             days( max(case when absent=0 then date_ end) 
             over ( partition by id order by date_ rows between unbounded preceding and current row )) as diff
    FROM     sample_set )
    GROUP BY id

  10. #10
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    If "the maximum consecutive days where ABSENT = 1" was begin from first date of an id,
    "max(case when absent=0 then date_ end) over(...)" for the first consecutive days of the id would be NULL.

    The id = 'Z0123' is such an example.

    Example 4: fengsun2
    Code:
    ------------------------------ Commands Entered ------------------------------
    WITH
     sample_set
    ( date_ , id , absent ) AS (
    VALUES
      ( '2012-01-01' , 'X1234' , 0 )
    , ( '2012-01-02' , 'X1234' , 1 )
    , ( '2012-01-03' , 'X1234' , 0 )
    , ( '2012-01-04' , 'X1234' , 1 )
    , ( '2012-01-05' , 'X1234' , 1 )
    , ( '2012-01-06' , 'X1234' , 1 )
    , ( '2012-01-07' , 'X1234' , 0 )
    , ( '2012-01-08' , 'X1234' , 0 )
    , ( '2012-01-09' , 'X1234' , 1 )
    , ( '2012-01-10' , 'X1234' , 1 )
    , ( '2012-01-11' , 'X1234' , 1 )
    , ( '2012-01-12' , 'X1234' , 1 )
    
    , ( '2012-01-01' , 'Z0123' , 1 )
    , ( '2012-01-02' , 'Z0123' , 1 )
    , ( '2012-01-03' , 'Z0123' , 1 )
    , ( '2012-01-04' , 'Z0123' , 1 )
    , ( '2012-01-05' , 'Z0123' , 1 )
    , ( '2012-01-06' , 'Z0123' , 0 )
    , ( '2012-01-07' , 'Z0123' , 0 )
    , ( '2012-01-08' , 'Z0123' , 0 )
    , ( '2012-01-09' , 'Z0123' , 1 )
    , ( '2012-01-10' , 'Z0123' , 1 )
    , ( '2012-01-11' , 'Z0123' , 1 )
    , ( '2012-01-12' , 'Z0123' , 1 )
    )
    SELECT   id,
             max(diff)
    FROM     ( SELECT   id,
             days(date_) - 
             days( max(case when absent=0 then date_ end) 
             over ( partition by id order by date_ rows between unbounded preceding and current row )) as diff
    FROM     sample_set )
    GROUP BY id
    ;
    ------------------------------------------------------------------------------
    
    ID    2          
    ----- -----------
    X1234           4
    Z0123           4
    
      2 record(s) selected.

    This might be a solution for the issue.
    Example 5:
    Code:
    ------------------------------ Commands Entered ------------------------------
    WITH
     sample_set
    ( date_ , id , absent ) AS (
    VALUES
    << snipped: same data as Example 4. >>
    )
    SELECT   id,
             max(diff)
    FROM     ( SELECT   id,
             days(date_) - 
             COALESCE(
                days( max(case when absent=0 then date_ end)
                      over ( partition by id
                                 order by date_
                             rows between unbounded preceding
                                      and current row )
                    )
              , DAYS( MIN(date_)
                      OVER ( partition by id )
                    ) - 1
             ) as diff
    FROM     sample_set )
    GROUP BY id
    ;
    ------------------------------------------------------------------------------
    
    ID    2          
    ----- -----------
    X1234           4
    Z0123           5
    
      2 record(s) selected.
    Last edited by tonkuma; 04-08-13 at 05:28.

  11. #11
    Join Date
    Apr 2013
    Posts
    2
    Thanks guys for such quick replies! I will try them out today.
    I do apologize for not adding that the QMF we have is on version 9 release 1 for windows
    Last edited by Zavwar; 04-08-13 at 17:51.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •