Results 1 to 7 of 7
  1. #1
    Join Date
    Nov 2011
    Posts
    3

    Unanswered: Database with varying timestamp

    Hi, I'm fairly new to the use of databases. I have a rather unusual query. I have a set of data which is archived daily. Every day a new line is added to the archive with 1024 values. These correspond to measurements at 15-minute intervals. Therefore, every day a line is added containing around 10 days worth of data. Therefore, there is an overlap in each line of data. In addition, the archive is not always created at the same time, so the timestamp of the first value changes depending on when it was created.

    Probably best if I give an example of the data:

    15/04/2011 09:15:00, 24.9, 24.3, 24,6, 24.8 .... (1024 values)
    16/04/2011 09:15:00, 25.5, 24.3, 24,4, 24.8 .... (1024 values)
    18/04/2011 10:30:00, 27.5, 27.4, 27,6, 27.4 .... (1024 values)
    19/04/2011 09:15:00, 28.5, 28.3, 28,4, 28.8 .... (1024 values)

    In addition there are some days where data is not recorded. So my problem is trying to put all of this data in sequence. As this is a constant stream of data, I need to automate the process. I have tried using Excel and VBA. However, it's getting extremely frustrating to get the result i'm looking for. Can anyone here help or point me in the right direction.

    Thanks in advance

  2. #2
    Join Date
    Nov 2004
    Posts
    1,427
    Provided Answers: 4
    People just sometimes have the weirdest problems on this forum

    I don't know if it is the most elegant solution, but I'm pretty sure it will do the job:
    Code:
    -- source table
    DROP TABLE #DaTable
    CREATE TABLE #DaTable(
    	datime DATETIME	NOT NULL,
    	val1	dec(5,1)	NOT NULL,
    	val2	dec(5,1)	NOT NULL,
    	val3	dec(5,1)	NOT NULL,
    	val4	dec(5,1)	NOT NULL
    )
    
    -- final normalised table
    DROP TABLE #MyTable
    CREATE TABLE #MyTable(
    	dt DATETIME	NOT NULL,
    	val	dec(5,1)	NOT NULL,
    	CONSTRAINT PK_MyTable PRIMARY KEY (dt)
    )
    
    -- temporary table, with the same structure as final table
    DROP TABLE #TempMyTable
    CREATE TABLE #TempMyTable(
    	dt DATETIME	NOT NULL,
    	val	dec(5,1)	NOT NULL
    )
    
    -- populate source table
    INSERT INTO #DaTable(datime, val1, val2, val3, val4) VALUES
    (CONVERT(datetime, '15/04/2011 09:15:00', 103), 24.9, 24.3, 24.6, 24.8),
    (CONVERT(datetime, '16/04/2011 09:15:00', 103), 25.5, 24.3, 24.4, 24.8),
    (CONVERT(datetime, '18/04/2011 10:30:00', 103), 27.5, 27.4, 27.6, 27.4),
    (CONVERT(datetime, '19/04/2011 09:15:00', 103), 28.5, 28.3, 28.4, 28.8)
    
    select * from #DaTable
    
    -- empty temp table
    TRUNCATE TABLE #TempMyTable
    
    -- a variable that holds the datetime of the last source table record that was processed (*)
    DECLARE @lastDatetimeProcessed	DATETIME
    SET @lastDatetimeProcessed = '1753-01-01 00:00:01'
    
    -- insert first datetime dt/value pair 
    -- with datetime more recent than @lastDatetimeProcessed
    INSERT INTO #TempMyTable (dt, val)
    select datime, val1
    from #DaTable
    WHERE datime > @lastDatetimeProcessed
    
    -- insert second datetime dt/value pair. dt = datime + 15 minutes
    INSERT INTO #TempMyTable (dt, val)
    select DATEADD(minute, 15 * 1, datime), val2
    from #DaTable
    WHERE datime > @lastDatetimeProcessed
    
    -- insert second datetime dt/value pair. dt = datime + (15 minutes * 2)
    INSERT INTO #TempMyTable (dt, val)
    select DATEADD(minute, 15 * 2, datime), val3
    from #DaTable
    WHERE datime > @lastDatetimeProcessed
    
    -- insert second datetime dt/value pair. dt = datime + (15 minutes * 3)
    INSERT INTO #TempMyTable (dt, val)
    select DATEADD(minute, 15 * 3, datime), val4
    from #DaTable
    WHERE datime > @lastDatetimeProcessed
    
    -- ....
    
    -- insert last datetime dt/value pair. dt = datime + (15 minutes * 1024)
    INSERT INTO #TempMyTable (dt, val)
    select DATEADD(minute, 15 * 1023, datime), val1024
    from #DaTable
    WHERE datime > @lastDatetimeProcessed
    
    SELECT distinct * FROM #TempMyTable
    
    -- Insert distinct values from temp table into the final destination table, 
    -- check that that record (same datetime) is not yet in the final table already
    INSERT INTO #MyTable(dt, val)
    SELECT DISTINCT T.dt, T.val
    FROM #TempMyTable as T
    	LEFT OUTER JOIN #MyTable as M ON
    		T.dt = M.dt
    WHERE M.dt IS NULL
    
    -- set @lastDatetimeProcessed to the last datetime in source table
    -- store this value somewhere so you can use it (*) next time this script needs to run again
    SELECT @lastDatetimeProcessed = MAX(datime) from #DaTable
    
    select * from #MyTable
    It assumes that in cases of overlapping records (different records with same datetime) the values will be the same too.

    <shields up>
    You could write code to generate dynamic SQL for the 1023 DML's
    Code:
    INSERT INTO #TempMyTable (dt, val)
    select DATEADD(minute, 15 * 3, datime), val4
    from #DaTable
    WHERE datime > @lastDatetimeProcessed
    <shields down>
    Last edited by Wim; 11-05-11 at 06:59. Reason: Corrected an arror in the DDT for #TempMyTable. Removed the PK definition. It must be able to store duplicate dt's.
    With kind regards . . . . . SQL Server 2000/2005/2012
    Wim

    Grabel's Law: 2 is not equal to 3 -- not even for very large values of 2.
    Pat Phelan's Law: 2 very definitely CAN equal 3 -- in at least two programming languages

  3. #3
    Join Date
    Nov 2004
    Posts
    1,427
    Provided Answers: 4
    Just in case you wondered ...
    <shields up>
    Code:
    DECLARE @i INT
    DECLARE @params nvarchar(100)
    DECLARE @datetime nvarchar(20)
    DECLARE @sql nvarchar(400)
    
    SET @i = 2
    SET @params = N'@lastDatetimeProcessed datetime '
    SELECT @datetime = CONVERT(varchar(20), @lastDatetimeProcessed, 120)
    
    WHILE @i <= 1024
    BEGIN
    	SET @sql =	' INSERT INTO #TempMyTable (dt, val) ' +
    			' select DATEADD(minute, ' + CAST(15 * (@i - 1) as VARCHAR(5)) + ', datime), val' + CAST(@i as VARCHAR(4)) + ' ' + 
    			' from #DaTable ' +
    			' WHERE datime > @lastDatetimeProcessed '
    	--print @sql
    	EXEC sp_executesql @sql, @params, @datetime
    
    	SET @i = @i + 1
    END
    <shields down>
    With kind regards . . . . . SQL Server 2000/2005/2012
    Wim

    Grabel's Law: 2 is not equal to 3 -- not even for very large values of 2.
    Pat Phelan's Law: 2 very definitely CAN equal 3 -- in at least two programming languages

  4. #4
    Join Date
    Nov 2011
    Posts
    3
    Thanks for the detailed response Wim. Now I just need to figure out what it all means and how I'm going to fit everything together I'm sort of getting a crash course in SQL here at the minute. I'm working on a PhD project which requires processing large quantities of data from Building Monitoring systems. Unfortunately, this is just the groundwork and there's a long way to go yet. I seem to have underestimated the work required in processing this stuff So your help is very much appreciated. Thanks a million

  5. #5
    Join Date
    Nov 2004
    Posts
    1,427
    Provided Answers: 4
    Can you give the structure of the tables involved? Like the name of the source table (the one with the 1024 values) and the name of its columns?

    Now I just need to figure out what it all means and how I'm going to fit everything together
    What is unclear about the solution?
    With kind regards . . . . . SQL Server 2000/2005/2012
    Wim

    Grabel's Law: 2 is not equal to 3 -- not even for very large values of 2.
    Pat Phelan's Law: 2 very definitely CAN equal 3 -- in at least two programming languages

  6. #6
    Join Date
    Nov 2011
    Posts
    3
    Quote Originally Posted by Wim View Post
    What is unclear about the solution?
    Apologies, I didn't mean the solution was unclear. I just lack the experience with SQL at present so I'm currently on a bit of a steep learning curve.

    Quote Originally Posted by Wim View Post
    Can you give the structure of the tables involved? Like the name of the source table (the one with the 1024 values) and the name of its columns?
    Here is a link to one of the files I'm working with: Basement Temp. Basically, I have 9 of these files (for temperature values) and I'm trying to assimilate the data from each. They all follow the same format and the timestamps are usually the same for each. However, I cannot assume this is correct so I need to check.

    The columns are not labelled as you can see so the timestamp in column A is just associated with the first value given in column C.

    Hope this helps. Thanks again for the input, it's a massive help.

  7. #7
    Join Date
    Nov 2004
    Posts
    1,427
    Provided Answers: 4
    Skimming through the file you gave, I noticed that the data consists of the datetime, the number of measures, and then the values of all those measures (max 1024).

    You are dealing with an unusual situation. 1024 values in one record is beyond madness in a relational database.

    Normally I would load the CSV file into a table and do the processing from there on.
    But SQL Server imposes some restrictions:
    - there is a limit of 1024 columns per record, and you will need 1026 (datime, #values, 1024 values)
    - there is a limit of 8060 bytes per record
    A single table is not capable of holding all those columns.

    You will have to load those CSV files in two phases, the first time the first half of the columns, in the second phase the other columns.

    Perhaps looking for another database that can cope with such a huge number of columns may be better at the end. DB2 nor Oracle allow more columns. MySQL and PostgreSQL seem to be better choices.

    Perhaps you should tackle the first step of this problem with something else than with a database. How good is your awk knowledge? Write an awk script that does the splitting of the huge rows in the source CSV files into small normalised datetime, value rows that you can easily load in a database table. Add a DISTINCT to the SELECT to remove all the duplicates, and you're set.
    With kind regards . . . . . SQL Server 2000/2005/2012
    Wim

    Grabel's Law: 2 is not equal to 3 -- not even for very large values of 2.
    Pat Phelan's Law: 2 very definitely CAN equal 3 -- in at least two programming languages

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •