| |
|
If this is your first visit, be sure to check out the FAQ by clicking the link above.
You may have to register before you can post: click the register link above to proceed.
To start viewing messages, select the forum that you want to visit from the selection below.
|
 |

12-24-11, 22:21
|
|
Registered User
|
|
Join Date: Dec 2011
Posts: 3
|
|
Moving Window Query - Trailing Days in SQL
|
|
I am having trouble writing a sql statement. Here is the scenario:
- My table contains the member_id and calendar_date that members of my social community logged in as follows:
CREATE TABLE LOGIN_HISTORY
(
Login_Key int NOT NULL AUTO_INCREMENT, -- unique identifier for the login event
Member_ID int NOT NULL, -- the member who logged in
Login_Date DATETIME NOT NULL -- the date they logged in on
)
I am trying to write a SQL query that will allow me to create a trend chart of the following:
- How many members who logged in during the past 7-14 days also logged in during the past 1-7 days
- For example (given today is Jan 14), if a member logged into the community during Jan 1 - 7, they would get counted if they also logged in during Jan 8 - 14.
Since this is a trend of percent of returning users over time, I believe I need to run a 7 trailing days calculation in SQL and also be able to calculate the denominator (all logins within the period) as well as the numerator (all who return during the next 7 days).
Here is some sample data that would be in the source table (DDL specified above):
Login_Key Member_ID Login_Date
1 123 12/27/11
2 123 12/27/11 (logged in twice on this date, could have multiple logins on a day but they we do not count more than the first)
3 123 1/12/12
4 123 1/13/12
In case this question is difficult to follow, here is an example of what the result set might look like (if it is slightly different, I can make some adjustments, group by, etc): * Record_No included for the discussion below *
Week_Ending_Date Member_ID Logged_In_Prior_Week Logged_In_Current_Week Record_No
1/10/12 123 YES NO 1 (prior week = 12/27/11 - 1/2/12; current week = 1/3/12 - 1/10/12)
1/11/12 123 NO NO 2 (prior week = 12/28/11 - 1/3/12; current week = 1/4/12 - 1/11/12)
1/12/12 123 NO YES 3 (prior week = 12/29/11 - 1/4/12; current week = 1/5/12 - 1/12/12)
1/13/12 123 NO YES 4 (prior week = 12/30/11 - 1/5/12; current week = 1/6/12 - 1/13/12)
I will count the records that meet my criteria (record 1) and then divide (Logged_In_Current_Week/Logged_In_Prior_Week) and trend it across all of the days (even though Member_ID 123 did not contribute to 1/1/12, 1/11/12, and 1/13/12, other members likely would.
This is not as simple as looking for the week portion of the date and grouping by that as each day the trend report would change (as opposed to changing every seven days). I think I need a trailing 7-day calculation or a moving window. I also need to keep this flexible as someone will invariably want the same calculation for a 30-day look back (with prior period = 60-30 days ago and current period = 30-1 days ago).
I have a DATE dimension table that has a record for each day if that helps. If this is not possible with a query, then I will have to write a cursor to load a table and read from that but it seems possible with a query.
Thank you for contributing some thoughts.
Cheers.
|
|

12-25-11, 01:37
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
|
|
you probably don't want an ANSI SQL solution
is it okay to request that this thread be moved to the MySQL forum?
|
|

12-25-11, 10:54
|
|
Registered User
|
|
Join Date: Dec 2011
Posts: 3
|
|
|
|
Hmmm... I don't think this is a problem that is specific to any DMBS platform. It is a query that should be able to be solved in the standard ANSI SQL. Does this make sense?
I just don't think this is a platform-specific query.
|
|

12-25-11, 13:04
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
|
|
ok, as you wish
just be advised that the ANSI window functions, like LAG specifically, will not be available in mysql
|
|

12-25-11, 14:46
|
|
Registered User
|
|
Join Date: Dec 2011
Posts: 3
|
|
Good point... I think it should be moved. Thanks for clarifying.
|
|

12-27-11, 05:41
|
|
Registered User
|
|
Join Date: Nov 2003
Posts: 2,407
|
|
An ANSI solution would probably involve something like
Code:
count(*) over (partition by Member_ID order by Login_Date asc) rows between current row and 14 preceding)
which would give you the number of logins for the last 14 rows for the "current row".
|
|

02-07-12, 17:02
|
|
Registered User
|
|
Join Date: Feb 2008
Location: Japan
Posts: 2,191
|
|
How about this?
Example 1:
Code:
SELECT week_ending_date
, member_id
, COALESCE(
MAX(
CASE
WHEN DAYS(login_date)
BETWEEN DAYS(week_ending_date) - 13
AND DAYS(week_ending_date) - 7 THEN
'Yes'
END
)
, 'No'
) AS Logged_In_Prior_Week
, COALESCE(
MAX(
CASE
WHEN DAYS(login_date)
BETWEEN DAYS(week_ending_date) - 6
AND DAYS(week_ending_date) THEN
'Yes'
END
)
, 'No'
) AS Logged_In_Current_Week
FROM duration
, login_history
GROUP BY
member_id
, week_ending_date
;
Tested on Mimer SQL Developers - Mimer SQL-2003 Validator
Code:
Result:
The following feature outside Core SQL-2003 is used:
F391, "Long identifiers"
The following vendor reserved word is used:
DAYS
DAYS function in DB2
Quote:
|
The result is 1 more than the number of days from January 1, 0001 to D, where D is the date that would occur if the DATE function were applied to the argument.
|
DAYS - IBM DB2 9.7 for Linux, UNIX, and Windows
Example 2: Tested on DB2 9.7.5 on Windows
Code:
------------------------------ Commands Entered ------------------------------
WITH
Login_History
(Login_Key , Member_ID , Login_Date) AS (
VALUES
( 1 , 123 , '12/27/2011' )
, ( 2 , 123 , '12/27/2011' )
, ( 3 , 123 , '01/11/2012' )
, ( 4 , 123 , '01/12/2012' )
)
, Duration
(Week_Ending_Date) AS (
VALUES
'01/09/2012'
, '01/10/2012'
, '01/11/2012'
, '01/12/2012'
)
SELECT week_ending_date
, member_id
, COALESCE(
MAX(
CASE
WHEN DAYS(login_date)
BETWEEN DAYS(week_ending_date) - 13
AND DAYS(week_ending_date) - 7 THEN
'Yes'
END
)
, 'No'
) AS Logged_In_Prior_Week
, COALESCE(
MAX(
CASE
WHEN DAYS(login_date)
BETWEEN DAYS(week_ending_date) - 6
AND DAYS(week_ending_date) THEN
'Yes'
END
)
, 'No'
) AS Logged_In_Current_Week
FROM duration
, login_history
GROUP BY
member_id
, week_ending_date
;
Result was...
Code:
------------------------------------------------------------------------------
WEEK_ENDING_DATE MEMBER_ID LOGGED_IN_PRIOR_WEEK LOGGED_IN_CURRENT_WEEK
---------------- ----------- -------------------- ----------------------
01/09/2012 123 Yes No
01/10/2012 123 No No
01/11/2012 123 No Yes
01/12/2012 123 No Yes
4 record(s) selected.
Example 2 was
Code:
Result:
The following features outside Core SQL-2003 are used:
F641, "Row and table constructors"
F391, "Long identifiers"
T121, "WITH (excluding RECURSIVE) in query expression"
F661, "Simple tables"
The following vendor reserved word is used:
DAYS
|
|

02-07-12, 20:36
|
|
Registered User
|
|
Join Date: Feb 2008
Location: Japan
Posts: 2,191
|
|
Although the Example 3 worked on DB2 9.7.5 on Windows,
it looks like that SQL 2003 standard doesn't support a expression in GROUP BY clause.
Example 3:
Code:
SELECT week_ending_date
, COALESCE(p.member_id , c.member_id , n.member_id) AS member_id
, CASE
WHEN COUNT(p.member_id) > 0 THEN
'Yes'
ELSE 'No'
END AS Logged_In_Prior_Week
, CASE
WHEN COUNT(c.member_id) > 0 THEN
'Yes'
ELSE 'No'
END AS Logged_In_Current_Week
FROM duration AS d
LEFT OUTER JOIN
login_history AS p
ON DAYS(p.login_date)
BETWEEN DAYS(d.week_ending_date) - 13
AND DAYS(d.week_ending_date) - 7
LEFT OUTER JOIN
login_history AS c
ON DAYS(c.login_date)
BETWEEN DAYS(d.week_ending_date) - 6
AND DAYS(d.week_ending_date)
LEFT OUTER JOIN
login_history AS n
ON COALESCE(p.member_id , c.member_id) IS NULL
GROUP BY
week_ending_date
, COALESCE(p.member_id , c.member_id , n.member_id)
;
------------------------------------------------------------------------------
WEEK_ENDING_DATE MEMBER_ID LOGGED_IN_PRIOR_WEEK LOGGED_IN_CURRENT_WEEK
---------------- ----------- -------------------- ----------------------
01/09/2012 123 Yes No
01/10/2012 123 No No
01/11/2012 123 No Yes
01/12/2012 123 No Yes
4 record(s) selected.
Tested on Mimer SQL Developers - Mimer SQL-2003 Validator
Code:
Result:
SELECT week_ending_date
, COALESCE(p.member_id , c.member_id , n.member_id) AS member_id
, CASE
WHEN COUNT(p.member_id) > 0 THEN
'Yes'
ELSE 'No'
END AS Logged_In_Prior_Week
, CASE
WHEN COUNT(c.member_id) > 0 THEN
'Yes'
ELSE 'No'
END AS Logged_In_Current_Week
FROM duration AS d
LEFT OUTER JOIN
login_history AS p
ON DAYS(p.login_date)
BETWEEN DAYS(d.week_ending_date) - 13
AND DAYS(d.week_ending_date) - 7
LEFT OUTER JOIN
login_history AS c
ON DAYS(c.login_date)
BETWEEN DAYS(d.week_ending_date) - 6
AND DAYS(d.week_ending_date)
LEFT OUTER JOIN
login_history AS n
ON COALESCE(p.member_id , c.member_id) IS NULL
GROUP BY
COALESCE(p.member_id , c.member_id , n.member_id)
^-------
syntax error: COALESCE (
correction: (
, week_ending_date
;
But, if moved the expression into FROM clause by using CROSS JOIN and LATERAL,
it conforms with SQL standard
Example 4:
The SQL-2003 Validator!
Enter your SQL statement(s) in the box below and simply click the "Test SQL" button:
Code:
SELECT week_ending_date
, f.member_id
, CASE
WHEN COUNT(p.member_id) > 0 THEN
'Yes'
ELSE 'No'
END AS Logged_In_Prior_Week
, CASE
WHEN COUNT(c.member_id) > 0 THEN
'Yes'
ELSE 'No'
END AS Logged_In_Current_Week
FROM duration AS d
LEFT OUTER JOIN
login_history AS p
ON DAYS(p.login_date)
BETWEEN DAYS(d.week_ending_date) - 13
AND DAYS(d.week_ending_date) - 7
LEFT OUTER JOIN
login_history AS c
ON DAYS(c.login_date)
BETWEEN DAYS(d.week_ending_date) - 6
AND DAYS(d.week_ending_date)
LEFT OUTER JOIN
login_history AS n
ON COALESCE(p.member_id , c.member_id) IS NULL
CROSS JOIN
LATERAL (
VALUES COALESCE(p.member_id , c.member_id , n.member_id)
) AS f(member_id)
GROUP BY
week_ending_date
, f.member_id
;
Code:
Result:
The following features outside Core SQL-2003 are used:
F391, "Long identifiers"
T491, "LATERAL derived table"
F401, "Extended joined table"
F661, "Simple tables"
The following vendor reserved word is used:
DAYS
|
Last edited by tonkuma; 02-08-12 at 08:24.
Reason: Remove Example 5
|

02-08-12, 09:03
|
|
Registered User
|
|
Join Date: Feb 2008
Location: Japan
Posts: 2,191
|
|
Example 3 and Example 4 are not complete.
They wouldn't return some rows depending on data in login_history table.
Note: The rows which were not included in the results would satisfy "Logged_In_Prior_Week=No and Logged_In_Current_Week=No".
(Not all of the rows satisfied the condition. Some of them wouldn't be included.)
Another example.
Example 5:
Code:
SELECT d.week_ending_date
, y.member_id
, CASE
WHEN MIN( DAYS(h.login_date) - DAYS(d.week_ending_date)
) <= -7 THEN
'Yes'
ELSE 'No'
END AS Logged_In_Prior_Week
, CASE
WHEN MAX( DAYS(h.login_date) - DAYS(d.week_ending_date)
) >= -6 THEN
'Yes'
ELSE 'No'
END AS Logged_In_Current_Week
FROM duration AS d
CROSS JOIN (
SELECT DISTINCT
member_id
FROM login_history
) AS y
LEFT OUTER JOIN
login_history AS h
ON h.member_id = y.member_id
AND DAYS(h.login_date)
BETWEEN DAYS(d.week_ending_date) - 13
AND DAYS(d.week_ending_date)
GROUP BY
y.member_id
, d.week_ending_date
;
Tested on Mimer SQL Developers - Mimer SQL-2003 Validator
Code:
Result:
The following features outside Core SQL-2003 are used:
F391, "Long identifiers"
F591, "Derived tables"
F401, "Extended joined table"
The following vendor reserved word is used:
DAYS
Test data for DB2 9.7.5 on Windows.
Code:
SELECT * FROM login_history;
------------------------------------------------------------------------------
LOGIN_KEY MEMBER_ID LOGIN_DATE
----------- ----------- ----------
1 123 12/27/2011
2 123 12/27/2011
3 123 01/11/2012
4 123 01/12/2012
11 456 12/26/2011
12 456 12/27/2011
13 456 01/12/2012
14 456 01/12/2012
8 record(s) selected.
SELECT * FROM duration;
------------------------------------------------------------------------------
WEEK_ENDING_DATE
----------------
01/09/2012
01/10/2012
01/11/2012
01/12/2012
4 record(s) selected.
Result
Code:
WEEK_ENDING_DATE MEMBER_ID LOGGED_IN_PRIOR_WEEK LOGGED_IN_CURRENT_WEEK
---------------- ----------- -------------------- ----------------------
01/09/2012 123 Yes No
01/10/2012 123 No No
01/11/2012 123 No Yes
01/12/2012 123 No Yes
01/09/2012 456 Yes No
01/10/2012 456 No No
01/11/2012 456 No No <--- Not included in Example 3 nor Example 4
01/12/2012 456 No Yes
8 record(s) selected.
|
Last edited by tonkuma; 02-08-12 at 09:10.
Reason: Add Note for Example 3 and 4.
|
| Thread Tools |
Search this Thread |
|
|
|
| Display Modes |
Linear Mode
|
Posting Rules
|
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts
HTML code is Off
|
|
|
|
|