If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Data Access, Manipulation & Batch Languages > ANSI SQL > Moving Window Query - Trailing Days in SQL

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 12-24-11, 22:21
agg.jjk agg.jjk is offline
Registered User
 
Join Date: Dec 2011
Posts: 3
Red face Moving Window Query - Trailing Days in SQL

I am having trouble writing a sql statement. Here is the scenario:

- My table contains the member_id and calendar_date that members of my social community logged in as follows:

CREATE TABLE LOGIN_HISTORY
(
Login_Key int NOT NULL AUTO_INCREMENT, -- unique identifier for the login event
Member_ID int NOT NULL, -- the member who logged in
Login_Date DATETIME NOT NULL -- the date they logged in on
)

I am trying to write a SQL query that will allow me to create a trend chart of the following:

- How many members who logged in during the past 7-14 days also logged in during the past 1-7 days
- For example (given today is Jan 14), if a member logged into the community during Jan 1 - 7, they would get counted if they also logged in during Jan 8 - 14.

Since this is a trend of percent of returning users over time, I believe I need to run a 7 trailing days calculation in SQL and also be able to calculate the denominator (all logins within the period) as well as the numerator (all who return during the next 7 days).

Here is some sample data that would be in the source table (DDL specified above):
Login_Key Member_ID Login_Date
1 123 12/27/11
2 123 12/27/11 (logged in twice on this date, could have multiple logins on a day but they we do not count more than the first)
3 123 1/12/12
4 123 1/13/12

In case this question is difficult to follow, here is an example of what the result set might look like (if it is slightly different, I can make some adjustments, group by, etc): * Record_No included for the discussion below *

Week_Ending_Date Member_ID Logged_In_Prior_Week Logged_In_Current_Week Record_No
1/10/12 123 YES NO 1 (prior week = 12/27/11 - 1/2/12; current week = 1/3/12 - 1/10/12)
1/11/12 123 NO NO 2 (prior week = 12/28/11 - 1/3/12; current week = 1/4/12 - 1/11/12)
1/12/12 123 NO YES 3 (prior week = 12/29/11 - 1/4/12; current week = 1/5/12 - 1/12/12)
1/13/12 123 NO YES 4 (prior week = 12/30/11 - 1/5/12; current week = 1/6/12 - 1/13/12)

I will count the records that meet my criteria (record 1) and then divide (Logged_In_Current_Week/Logged_In_Prior_Week) and trend it across all of the days (even though Member_ID 123 did not contribute to 1/1/12, 1/11/12, and 1/13/12, other members likely would.

This is not as simple as looking for the week portion of the date and grouping by that as each day the trend report would change (as opposed to changing every seven days). I think I need a trailing 7-day calculation or a moving window. I also need to keep this flexible as someone will invariably want the same calculation for a 30-day look back (with prior period = 60-30 days ago and current period = 30-1 days ago).

I have a DATE dimension table that has a record for each day if that helps. If this is not possible with a query, then I will have to write a cursor to load a table and read from that but it seems possible with a query.

Thank you for contributing some thoughts.

Cheers.
Reply With Quote
  #2 (permalink)  
Old 12-25-11, 01:37
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
you probably don't want an ANSI SQL solution

is it okay to request that this thread be moved to the MySQL forum?
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #3 (permalink)  
Old 12-25-11, 10:54
agg.jjk agg.jjk is offline
Registered User
 
Join Date: Dec 2011
Posts: 3
Hmmm... I don't think this is a problem that is specific to any DMBS platform. It is a query that should be able to be solved in the standard ANSI SQL. Does this make sense?

I just don't think this is a platform-specific query.
Reply With Quote
  #4 (permalink)  
Old 12-25-11, 13:04
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
ok, as you wish

just be advised that the ANSI window functions, like LAG specifically, will not be available in mysql
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #5 (permalink)  
Old 12-25-11, 14:46
agg.jjk agg.jjk is offline
Registered User
 
Join Date: Dec 2011
Posts: 3
Good point... I think it should be moved. Thanks for clarifying.
Reply With Quote
  #6 (permalink)  
Old 12-27-11, 05:41
shammat shammat is offline
Registered User
 
Join Date: Nov 2003
Posts: 2,407
An ANSI solution would probably involve something like

Code:
count(*) over (partition by Member_ID order by Login_Date asc) rows between current row and 14 preceding)
which would give you the number of logins for the last 14 rows for the "current row".
Reply With Quote
  #7 (permalink)  
Old 02-07-12, 17:02
tonkuma tonkuma is offline
Registered User
 
Join Date: Feb 2008
Location: Japan
Posts: 2,191
How about this?

Example 1:
Code:
SELECT week_ending_date
     , member_id
     , COALESCE(
          MAX(
             CASE
             WHEN DAYS(login_date)
                  BETWEEN DAYS(week_ending_date) - 13
                      AND DAYS(week_ending_date) -  7 THEN
                  'Yes'
             END
          )
        , 'No'
       ) AS Logged_In_Prior_Week
     , COALESCE(
          MAX(
             CASE
             WHEN DAYS(login_date)
                  BETWEEN DAYS(week_ending_date) -  6
                      AND DAYS(week_ending_date)      THEN
                  'Yes'
             END
          )
        , 'No'
       ) AS Logged_In_Current_Week
 FROM  duration
     , login_history
 GROUP BY
       member_id
     , week_ending_date
;
Tested on Mimer SQL Developers - Mimer SQL-2003 Validator
Code:
Result:

The following feature outside Core SQL-2003 is used:

F391, "Long identifiers"

The following vendor reserved word is used:

DAYS
DAYS function in DB2
Quote:
The result is 1 more than the number of days from January 1, 0001 to D, where D is the date that would occur if the DATE function were applied to the argument.
DAYS - IBM DB2 9.7 for Linux, UNIX, and Windows


Example 2: Tested on DB2 9.7.5 on Windows
Code:
------------------------------ Commands Entered ------------------------------
WITH
  Login_History
(Login_Key , Member_ID , Login_Date) AS (
VALUES
  ( 1 , 123 , '12/27/2011' )
, ( 2 , 123 , '12/27/2011' )
, ( 3 , 123 , '01/11/2012' )
, ( 4 , 123 , '01/12/2012' )
)
, Duration
(Week_Ending_Date) AS (
VALUES
  '01/09/2012'
, '01/10/2012'
, '01/11/2012'
, '01/12/2012'
)
SELECT week_ending_date
     , member_id
     , COALESCE(
          MAX(
             CASE
             WHEN DAYS(login_date)
                  BETWEEN DAYS(week_ending_date) - 13
                      AND DAYS(week_ending_date) -  7 THEN
                  'Yes'
             END
          )
        , 'No'
       ) AS Logged_In_Prior_Week
     , COALESCE(
          MAX(
             CASE
             WHEN DAYS(login_date)
                  BETWEEN DAYS(week_ending_date) -  6
                      AND DAYS(week_ending_date)      THEN
                  'Yes'
             END
          )
        , 'No'
       ) AS Logged_In_Current_Week
 FROM  duration
     , login_history
 GROUP BY
       member_id
     , week_ending_date
;
Result was...
Code:
------------------------------------------------------------------------------

WEEK_ENDING_DATE MEMBER_ID   LOGGED_IN_PRIOR_WEEK LOGGED_IN_CURRENT_WEEK
---------------- ----------- -------------------- ----------------------
01/09/2012               123 Yes                  No                    
01/10/2012               123 No                   No                    
01/11/2012               123 No                   Yes                   
01/12/2012               123 No                   Yes                   

  4 record(s) selected.
Example 2 was
Code:
Result:

The following features outside Core SQL-2003 are used:

F641, "Row and table constructors"
F391, "Long identifiers"
T121, "WITH (excluding RECURSIVE) in query expression"
F661, "Simple tables"

The following vendor reserved word is used:

DAYS
Reply With Quote
  #8 (permalink)  
Old 02-07-12, 20:36
tonkuma tonkuma is offline
Registered User
 
Join Date: Feb 2008
Location: Japan
Posts: 2,191
Although the Example 3 worked on DB2 9.7.5 on Windows,
it looks like that SQL 2003 standard doesn't support a expression in GROUP BY clause.

Example 3:
Code:
SELECT week_ending_date
     , COALESCE(p.member_id , c.member_id , n.member_id) AS member_id
     , CASE
       WHEN COUNT(p.member_id) > 0 THEN
            'Yes'
       ELSE 'No'
       END  AS Logged_In_Prior_Week
     , CASE
       WHEN COUNT(c.member_id) > 0 THEN
            'Yes'
       ELSE 'No'
       END  AS Logged_In_Current_Week
 FROM  duration      AS d
 LEFT  OUTER JOIN
       login_history AS p
   ON  DAYS(p.login_date)
       BETWEEN DAYS(d.week_ending_date) - 13
           AND DAYS(d.week_ending_date) -  7
 LEFT  OUTER JOIN
       login_history AS c
   ON  DAYS(c.login_date)
       BETWEEN DAYS(d.week_ending_date) -  6
           AND DAYS(d.week_ending_date)
 LEFT  OUTER JOIN
       login_history AS n
   ON  COALESCE(p.member_id , c.member_id) IS NULL
 GROUP BY
       week_ending_date
     , COALESCE(p.member_id , c.member_id , n.member_id)
;
------------------------------------------------------------------------------

WEEK_ENDING_DATE MEMBER_ID   LOGGED_IN_PRIOR_WEEK LOGGED_IN_CURRENT_WEEK
---------------- ----------- -------------------- ----------------------
01/09/2012               123 Yes                  No                    
01/10/2012               123 No                   No                    
01/11/2012               123 No                   Yes                   
01/12/2012               123 No                   Yes                   

  4 record(s) selected.
Tested on Mimer SQL Developers - Mimer SQL-2003 Validator
Code:
Result:

SELECT week_ending_date 
     , COALESCE(p.member_id , c.member_id , n.member_id) AS member_id 
     , CASE 
       WHEN COUNT(p.member_id) > 0 THEN 
            'Yes' 
       ELSE 'No' 
       END  AS Logged_In_Prior_Week 
     , CASE 
       WHEN COUNT(c.member_id) > 0 THEN 
            'Yes' 
       ELSE 'No' 
       END  AS Logged_In_Current_Week 
 FROM  duration      AS d 
 LEFT  OUTER JOIN 
       login_history AS p 
   ON  DAYS(p.login_date) 
       BETWEEN DAYS(d.week_ending_date) - 13 
           AND DAYS(d.week_ending_date) -  7 
 LEFT  OUTER JOIN 
       login_history AS c 
   ON  DAYS(c.login_date) 
       BETWEEN DAYS(d.week_ending_date) -  6 
           AND DAYS(d.week_ending_date) 
 LEFT  OUTER JOIN 
       login_history AS n 
   ON  COALESCE(p.member_id , c.member_id) IS NULL 
 GROUP BY 
       COALESCE(p.member_id , c.member_id , n.member_id) 
       ^-------
syntax error: COALESCE (
  correction: (

     , week_ending_date 
;

But, if moved the expression into FROM clause by using CROSS JOIN and LATERAL,
it conforms with SQL standard

Example 4:
The SQL-2003 Validator!

Enter your SQL statement(s) in the box below and simply click the "Test SQL" button:

Code:
SELECT week_ending_date
     , f.member_id
     , CASE
       WHEN COUNT(p.member_id) > 0 THEN
            'Yes'
       ELSE 'No'
       END  AS Logged_In_Prior_Week
     , CASE
       WHEN COUNT(c.member_id) > 0 THEN
            'Yes'
       ELSE 'No'
       END  AS Logged_In_Current_Week
 FROM  duration      AS d
 LEFT  OUTER JOIN
       login_history AS p
   ON  DAYS(p.login_date)
       BETWEEN DAYS(d.week_ending_date) - 13
           AND DAYS(d.week_ending_date) -  7
 LEFT  OUTER JOIN
       login_history AS c
   ON  DAYS(c.login_date)
       BETWEEN DAYS(d.week_ending_date) -  6
           AND DAYS(d.week_ending_date)
 LEFT  OUTER JOIN
       login_history AS n
   ON  COALESCE(p.member_id , c.member_id) IS NULL
 CROSS JOIN
 LATERAL (
       VALUES COALESCE(p.member_id , c.member_id , n.member_id)
       ) AS f(member_id)
 GROUP BY
       week_ending_date
     , f.member_id
;
Code:
Result:

The following features outside Core SQL-2003 are used:

F391, "Long identifiers"
T491, "LATERAL derived table"
F401, "Extended joined table"
F661, "Simple tables"

The following vendor reserved word is used:

DAYS

Last edited by tonkuma; 02-08-12 at 08:24. Reason: Remove Example 5
Reply With Quote
  #9 (permalink)  
Old 02-08-12, 09:03
tonkuma tonkuma is offline
Registered User
 
Join Date: Feb 2008
Location: Japan
Posts: 2,191
Example 3 and Example 4 are not complete.
They wouldn't return some rows depending on data in login_history table.

Note: The rows which were not included in the results would satisfy "Logged_In_Prior_Week=No and Logged_In_Current_Week=No".
(Not all of the rows satisfied the condition. Some of them wouldn't be included.)


Another example.

Example 5:
Code:
SELECT d.week_ending_date
     , y.member_id
     , CASE
       WHEN MIN( DAYS(h.login_date) - DAYS(d.week_ending_date)
               ) <= -7 THEN
            'Yes'
       ELSE 'No'
       END  AS Logged_In_Prior_Week
     , CASE
       WHEN MAX( DAYS(h.login_date) - DAYS(d.week_ending_date)
               ) >= -6 THEN
            'Yes'
       ELSE 'No'
       END  AS Logged_In_Current_Week
 FROM  duration      AS d
 CROSS JOIN (
       SELECT DISTINCT
              member_id
        FROM  login_history
       )             AS y
 LEFT  OUTER JOIN
       login_history AS h
   ON  h.member_id = y.member_id
   AND DAYS(h.login_date)
       BETWEEN DAYS(d.week_ending_date) - 13
           AND DAYS(d.week_ending_date)
 GROUP BY
       y.member_id
     , d.week_ending_date
;
Tested on Mimer SQL Developers - Mimer SQL-2003 Validator
Code:
Result:

The following features outside Core SQL-2003 are used:

F391, "Long identifiers"
F591, "Derived tables"
F401, "Extended joined table"

The following vendor reserved word is used:

DAYS

Test data for DB2 9.7.5 on Windows.
Code:
SELECT * FROM login_history;
------------------------------------------------------------------------------

LOGIN_KEY   MEMBER_ID   LOGIN_DATE
----------- ----------- ----------
          1         123 12/27/2011
          2         123 12/27/2011
          3         123 01/11/2012
          4         123 01/12/2012
         11         456 12/26/2011
         12         456 12/27/2011
         13         456 01/12/2012
         14         456 01/12/2012

  8 record(s) selected.


SELECT * FROM duration;
------------------------------------------------------------------------------

WEEK_ENDING_DATE
----------------
01/09/2012      
01/10/2012      
01/11/2012      
01/12/2012      

  4 record(s) selected.
Result
Code:
WEEK_ENDING_DATE MEMBER_ID   LOGGED_IN_PRIOR_WEEK LOGGED_IN_CURRENT_WEEK
---------------- ----------- -------------------- ----------------------
01/09/2012               123 Yes                  No                    
01/10/2012               123 No                   No                    
01/11/2012               123 No                   Yes                   
01/12/2012               123 No                   Yes                   
01/09/2012               456 Yes                  No                    
01/10/2012               456 No                   No                    
01/11/2012               456 No                   No                    <--- Not included in Example 3 nor Example 4
01/12/2012               456 No                   Yes                   

  8 record(s) selected.

Last edited by tonkuma; 02-08-12 at 09:10. Reason: Add Note for Example 3 and 4.
Reply With Quote
Reply

Tags
moving window, trailing days

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On