If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Database Server Software > Microsoft SQL Server > Database with varying timestamp

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 11-04-11, 10:11
dcoakley dcoakley is offline
Registered User
 
Join Date: Nov 2011
Posts: 3
Database with varying timestamp

Hi, I'm fairly new to the use of databases. I have a rather unusual query. I have a set of data which is archived daily. Every day a new line is added to the archive with 1024 values. These correspond to measurements at 15-minute intervals. Therefore, every day a line is added containing around 10 days worth of data. Therefore, there is an overlap in each line of data. In addition, the archive is not always created at the same time, so the timestamp of the first value changes depending on when it was created.

Probably best if I give an example of the data:

15/04/2011 09:15:00, 24.9, 24.3, 24,6, 24.8 .... (1024 values)
16/04/2011 09:15:00, 25.5, 24.3, 24,4, 24.8 .... (1024 values)
18/04/2011 10:30:00, 27.5, 27.4, 27,6, 27.4 .... (1024 values)
19/04/2011 09:15:00, 28.5, 28.3, 28,4, 28.8 .... (1024 values)

In addition there are some days where data is not recorded. So my problem is trying to put all of this data in sequence. As this is a constant stream of data, I need to automate the process. I have tried using Excel and VBA. However, it's getting extremely frustrating to get the result i'm looking for. Can anyone here help or point me in the right direction.

Thanks in advance
Reply With Quote
  #2 (permalink)  
Old 11-04-11, 17:13
Wim Wim is offline
Registered User
 
Join Date: Nov 2004
Posts: 1,279
People just sometimes have the weirdest problems on this forum

I don't know if it is the most elegant solution, but I'm pretty sure it will do the job:
Code:
-- source table
DROP TABLE #DaTable
CREATE TABLE #DaTable(
	datime DATETIME	NOT NULL,
	val1	dec(5,1)	NOT NULL,
	val2	dec(5,1)	NOT NULL,
	val3	dec(5,1)	NOT NULL,
	val4	dec(5,1)	NOT NULL
)

-- final normalised table
DROP TABLE #MyTable
CREATE TABLE #MyTable(
	dt DATETIME	NOT NULL,
	val	dec(5,1)	NOT NULL,
	CONSTRAINT PK_MyTable PRIMARY KEY (dt)
)

-- temporary table, with the same structure as final table
DROP TABLE #TempMyTable
CREATE TABLE #TempMyTable(
	dt DATETIME	NOT NULL,
	val	dec(5,1)	NOT NULL
)

-- populate source table
INSERT INTO #DaTable(datime, val1, val2, val3, val4) VALUES
(CONVERT(datetime, '15/04/2011 09:15:00', 103), 24.9, 24.3, 24.6, 24.8),
(CONVERT(datetime, '16/04/2011 09:15:00', 103), 25.5, 24.3, 24.4, 24.8),
(CONVERT(datetime, '18/04/2011 10:30:00', 103), 27.5, 27.4, 27.6, 27.4),
(CONVERT(datetime, '19/04/2011 09:15:00', 103), 28.5, 28.3, 28.4, 28.8)

select * from #DaTable

-- empty temp table
TRUNCATE TABLE #TempMyTable

-- a variable that holds the datetime of the last source table record that was processed (*)
DECLARE @lastDatetimeProcessed	DATETIME
SET @lastDatetimeProcessed = '1753-01-01 00:00:01'

-- insert first datetime dt/value pair 
-- with datetime more recent than @lastDatetimeProcessed
INSERT INTO #TempMyTable (dt, val)
select datime, val1
from #DaTable
WHERE datime > @lastDatetimeProcessed

-- insert second datetime dt/value pair. dt = datime + 15 minutes
INSERT INTO #TempMyTable (dt, val)
select DATEADD(minute, 15 * 1, datime), val2
from #DaTable
WHERE datime > @lastDatetimeProcessed

-- insert second datetime dt/value pair. dt = datime + (15 minutes * 2)
INSERT INTO #TempMyTable (dt, val)
select DATEADD(minute, 15 * 2, datime), val3
from #DaTable
WHERE datime > @lastDatetimeProcessed

-- insert second datetime dt/value pair. dt = datime + (15 minutes * 3)
INSERT INTO #TempMyTable (dt, val)
select DATEADD(minute, 15 * 3, datime), val4
from #DaTable
WHERE datime > @lastDatetimeProcessed

-- ....

-- insert last datetime dt/value pair. dt = datime + (15 minutes * 1024)
INSERT INTO #TempMyTable (dt, val)
select DATEADD(minute, 15 * 1023, datime), val1024
from #DaTable
WHERE datime > @lastDatetimeProcessed

SELECT distinct * FROM #TempMyTable

-- Insert distinct values from temp table into the final destination table, 
-- check that that record (same datetime) is not yet in the final table already
INSERT INTO #MyTable(dt, val)
SELECT DISTINCT T.dt, T.val
FROM #TempMyTable as T
	LEFT OUTER JOIN #MyTable as M ON
		T.dt = M.dt
WHERE M.dt IS NULL

-- set @lastDatetimeProcessed to the last datetime in source table
-- store this value somewhere so you can use it (*) next time this script needs to run again
SELECT @lastDatetimeProcessed = MAX(datime) from #DaTable

select * from #MyTable
It assumes that in cases of overlapping records (different records with same datetime) the values will be the same too.

<shields up>
You could write code to generate dynamic SQL for the 1023 DML's
Code:
INSERT INTO #TempMyTable (dt, val)
select DATEADD(minute, 15 * 3, datime), val4
from #DaTable
WHERE datime > @lastDatetimeProcessed
<shields down>
__________________
With kind regards . . . . . SQL Server 2000/2005/2008/2008 R2 Earned beers: 16
Wim
Beware of bugs in the above code; I have only proved it correct, not tried it. -- Donald Knuth
Grabel's Law: 2 is not equal to 3 -- not even for very large values of 2.
Pat Phelan's Law: 2 very definitely CAN equal 3 -- in at least two programming languages

Last edited by Wim; 11-05-11 at 05:59. Reason: Corrected an arror in the DDT for #TempMyTable. Removed the PK definition. It must be able to store duplicate dt's.
Reply With Quote
  #3 (permalink)  
Old 11-04-11, 18:33
Wim Wim is offline
Registered User
 
Join Date: Nov 2004
Posts: 1,279
Just in case you wondered ...
<shields up>
Code:
DECLARE @i INT
DECLARE @params nvarchar(100)
DECLARE @datetime nvarchar(20)
DECLARE @sql nvarchar(400)

SET @i = 2
SET @params = N'@lastDatetimeProcessed datetime '
SELECT @datetime = CONVERT(varchar(20), @lastDatetimeProcessed, 120)

WHILE @i <= 1024
BEGIN
	SET @sql =	' INSERT INTO #TempMyTable (dt, val) ' +
			' select DATEADD(minute, ' + CAST(15 * (@i - 1) as VARCHAR(5)) + ', datime), val' + CAST(@i as VARCHAR(4)) + ' ' + 
			' from #DaTable ' +
			' WHERE datime > @lastDatetimeProcessed '
	--print @sql
	EXEC sp_executesql @sql, @params, @datetime

	SET @i = @i + 1
END
<shields down>
__________________
With kind regards . . . . . SQL Server 2000/2005/2008/2008 R2 Earned beers: 16
Wim
Beware of bugs in the above code; I have only proved it correct, not tried it. -- Donald Knuth
Grabel's Law: 2 is not equal to 3 -- not even for very large values of 2.
Pat Phelan's Law: 2 very definitely CAN equal 3 -- in at least two programming languages
Reply With Quote
  #4 (permalink)  
Old 11-04-11, 20:35
dcoakley dcoakley is offline
Registered User
 
Join Date: Nov 2011
Posts: 3
Thanks for the detailed response Wim. Now I just need to figure out what it all means and how I'm going to fit everything together I'm sort of getting a crash course in SQL here at the minute. I'm working on a PhD project which requires processing large quantities of data from Building Monitoring systems. Unfortunately, this is just the groundwork and there's a long way to go yet. I seem to have underestimated the work required in processing this stuff So your help is very much appreciated. Thanks a million
Reply With Quote
  #5 (permalink)  
Old 11-05-11, 06:05
Wim Wim is offline
Registered User
 
Join Date: Nov 2004
Posts: 1,279
Can you give the structure of the tables involved? Like the name of the source table (the one with the 1024 values) and the name of its columns?

Quote:
Now I just need to figure out what it all means and how I'm going to fit everything together
What is unclear about the solution?
__________________
With kind regards . . . . . SQL Server 2000/2005/2008/2008 R2 Earned beers: 16
Wim
Beware of bugs in the above code; I have only proved it correct, not tried it. -- Donald Knuth
Grabel's Law: 2 is not equal to 3 -- not even for very large values of 2.
Pat Phelan's Law: 2 very definitely CAN equal 3 -- in at least two programming languages
Reply With Quote
  #6 (permalink)  
Old 11-05-11, 13:50
dcoakley dcoakley is offline
Registered User
 
Join Date: Nov 2011
Posts: 3
Quote:
Originally Posted by Wim View Post
What is unclear about the solution?
Apologies, I didn't mean the solution was unclear. I just lack the experience with SQL at present so I'm currently on a bit of a steep learning curve.

Quote:
Originally Posted by Wim View Post
Can you give the structure of the tables involved? Like the name of the source table (the one with the 1024 values) and the name of its columns?
Here is a link to one of the files I'm working with: Basement Temp. Basically, I have 9 of these files (for temperature values) and I'm trying to assimilate the data from each. They all follow the same format and the timestamps are usually the same for each. However, I cannot assume this is correct so I need to check.

The columns are not labelled as you can see so the timestamp in column A is just associated with the first value given in column C.

Hope this helps. Thanks again for the input, it's a massive help.
Reply With Quote
  #7 (permalink)  
Old 11-05-11, 19:13
Wim Wim is offline
Registered User
 
Join Date: Nov 2004
Posts: 1,279
Skimming through the file you gave, I noticed that the data consists of the datetime, the number of measures, and then the values of all those measures (max 1024).

You are dealing with an unusual situation. 1024 values in one record is beyond madness in a relational database.

Normally I would load the CSV file into a table and do the processing from there on.
But SQL Server imposes some restrictions:
- there is a limit of 1024 columns per record, and you will need 1026 (datime, #values, 1024 values)
- there is a limit of 8060 bytes per record
A single table is not capable of holding all those columns.

You will have to load those CSV files in two phases, the first time the first half of the columns, in the second phase the other columns.

Perhaps looking for another database that can cope with such a huge number of columns may be better at the end. DB2 nor Oracle allow more columns. MySQL and PostgreSQL seem to be better choices.

Perhaps you should tackle the first step of this problem with something else than with a database. How good is your awk knowledge? Write an awk script that does the splitting of the huge rows in the source CSV files into small normalised datetime, value rows that you can easily load in a database table. Add a DISTINCT to the SELECT to remove all the duplicates, and you're set.
__________________
With kind regards . . . . . SQL Server 2000/2005/2008/2008 R2 Earned beers: 16
Wim
Beware of bugs in the above code; I have only proved it correct, not tried it. -- Donald Knuth
Grabel's Law: 2 is not equal to 3 -- not even for very large values of 2.
Pat Phelan's Law: 2 very definitely CAN equal 3 -- in at least two programming languages
Reply With Quote
Reply

Tags
database, timestamp

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On