Results 1 to 12 of 12
  1. #1
    Join Date
    Sep 2010
    Posts
    6

    Unanswered: Regularizing time intervals

    I have a financial tick database where trades are recorded as they occur, timestamped with a field with millisecond precision. They are irregular.

    I now need to do calculations on this time series, eg. volatility, correlation with other time series, etc. All these calcs need to work on a time series of regularly spaced intervals. I need a query to pick the latest price at regular time intervals, eg. 1 minute.

    So the data looks like this:

    midprice timestamp
    1.234565 2010-05-05 16:04:23.330
    1.234533 2010-05-05 16:04:23.997
    1.234222 2010-05-05 16:04:24.102

    and I need a query to return the latest pricing of some regular time interval, like this

    midprice timestamp
    1.234533 2010-05-05 16:04:23
    1.234222 2010-05-05 16:04:24
    1.231222 2010-05-05 16:04:25

    The tricky part is grouping the data by some time interval and then finding the price corresponding to the latest time stamp within that interval.

    Any help greatly appreciated.

  2. #2
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    Something like:
    Code:
    DECLARE @t TABLE (
       midpoint	MONEY
    ,  timestamp	DATETIME
       )
       
    INSERT INTO @t
       SELECT 1.234565, '2010-05-05 16:04:23.330' UNION
       SELECT 1.234533, '2010-05-05 16:04:23.997' UNION
       SELECT 1.234222, '2010-05-05 16:04:24.102'
    
    SELECT * FROM @t
    
    SELECT AVG(midpoint)
    ,  DateAdd(second, DATEDIFF(second, DateAdd(day
    ,     DateDiff(day, 0, timestamp), 0), timestamp)
    ,     DateAdd(day, DateDiff(day, 0, timestamp), 0))
       FROM @t
       GROUP BY DateAdd(second, DATEDIFF(second
    ,     DateAdd(day, DateDiff(day, 0, timestamp), 0)
    ,     timestamp), DateAdd(day, DateDiff(day, 0
    ,     timestamp), 0))
    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  3. #3
    Join Date
    Sep 2010
    Posts
    6

    How to get last price within a time interval

    Neat approach, thanks. The problem I'm having is the AVG aggregate function operating over the grouped rows. What I need to do is to select the price with the *latest* time stamp within each grouped set of rows. I'm not quite sure how to do something that looks for the price column in the row that has the MAX(timestamp) within each grouped set.

  4. #4
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    How do you want to cope with ties (two or more rows with the same timestamp)?

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  5. #5
    Join Date
    Sep 2010
    Posts
    6

    Two rows with same timestamp but different prices

    Question - How do you want to cope with ties (two or more rows with the same timestamp)?

    Answer: Relevant question, and it does occur quite often that one has multiple rows with the same timestamp and different prices. Although this would have to occur for rows that are at the end of the grouped set of events. If that ever occurs, I'd take the average of the prices for rows with the same timestamp.

  6. #6
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    Code:
    DECLARE @t TABLE (
       midpoint	MONEY
    ,  timestamp	DATETIME
       )
       
    INSERT INTO @t
       SELECT 1.234565, '2010-05-05 16:04:23.330' UNION
       SELECT 1.234533, '2010-05-05 16:04:23.997' UNION
       SELECT 1.234222, '2010-05-05 16:04:24.102'
    
    SELECT * FROM @t
    
    SELECT (SELECT AVG(z.midpoint)
          FROM @t AS z
          WHERE  z.timestamp = MAX(a.timestamp)) AS last_midpoint
    ,  DateAdd(second, DATEDIFF(second, DateAdd(day
    ,     DateDiff(day, 0, timestamp), 0), timestamp)
    ,     DateAdd(day, DateDiff(day, 0, timestamp), 0))
       FROM @t AS a
       GROUP BY DateAdd(second, DATEDIFF(second
    ,     DateAdd(day, DateDiff(day, 0, timestamp), 0)
    ,     timestamp), DateAdd(day, DateDiff(day, 0
    ,     timestamp), 0))
    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  7. #7
    Join Date
    Sep 2010
    Posts
    6

    Elegant solution

    This is an elegant solution, and I am deeply appreciative of your help! Thank you kindly.

  8. #8
    Join Date
    Sep 2010
    Posts
    1

    Thumbs up

    I am trying to work out what is going on in the nested dateadd and datediff calculations. Could you perhaps explain in simple terms what each layer of the nested calculations is doing? Its very elegant. How might you extend this approach to regularizing time intervals into different slices, eg 15 minutes?
    Many thanks

  9. #9
    Join Date
    Sep 2010
    Posts
    6

    Another approach

    One alternative approach to group data by seconds, without having so many nested DateAdd and DateDiff calculations, might be the following:

    DateAdd(second,
    DateDiff(second,
    Convert(datetime ,'2010/01/01'), timestamp),
    Convert(datetime ,'2010/01/01')
    )


    The reason why I think one can use this approach is that it references the start time off 2010/01/01 rather than 1900/01/01, so it doesn't cause an integer overflow if one just uses the number of seconds (which can be up to 68 years), rather than separately using the number of days and then the number of seconds.

    Pat - any idea which takes longer to process for iterative calcs like these - your approach with nested DateDiff calculations, or doing these 'convert' calculations?

  10. #10
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    My calculation doesn't break no matter what timespan you use, it works for any valid DATETIME value. Yours is easier to type, but it is "fragile" in the sense that there are a number of problems with integer math that can cause unexpected results if you aren't careful.

    Your method works for specific sets of data, and it is easier to type. Mine doesn't break unexpectedly. Time wise any difference should be trivial.

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  11. #11
    Join Date
    Sep 2010
    Posts
    6

    How to do in SSIS?

    Pat, it would be useful to do the same calculation on live streaming data, using SSIS. Do you have any thoughts on how one might approach this in SSIS?

  12. #12
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    I'd do it the same way.

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •