Welcome to the dBforums forums.

You are currently viewing our boards as a guest which gives you limited access to view most discussions, articles and access our other FREE features. By joining our free community you will have access to post topics, communicate privately with other members (PM), respond to polls, upload your own photos and access many other special features. Registration is fast, simple and absolutely free so please, join our community today!

If you have any problems with the registration process or your account login, please contact contact support.

If you prefer not to see double-underlined words and corresponding ads, place your cursor
here for ContentLink opt out.

Go Back  dBforums > General > Database Concepts & Design > Free time series DBMS

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 10-09-07, 13:40
amm amm is offline
Registered User
 
Join Date: Oct 2007
Posts: 5
Free time series DBMS

Hi there,

I am currently involved in a project dealing with huge amounts of very simply structured data in the form of:

- a unix time-stamp
- 2 byte measurement data
- 2 byte meta information

The thing that's so special here is, that there are no delete or update operations involved at all. The only thing the DBMS has to provide is an extremely efficient way to append data, get access to the latest data-point and to calculate some statistics like arithmetic mean, modal values, minima/maxima.

So I was wondering, if a time series DBMS is actually the way to go, or if we should just go ahead and develop a home-brewn solution.

I also thought about using a general purpose DBMS like Mysql or Postgres, but I am having concerns with overhead and that it just will not be able to provide the data fast enough.

cheers, andreas
Reply With Quote
  #2 (permalink)  
Old 10-09-07, 14:22
loquin loquin is offline
Super Moderator
 
Join Date: Jun 2004
Location: Arizona, USA
Posts: 1,359
Can you guarantee that the time stamp will be unique? or that a combination of timestamp/meta information will be unique?

Please quantify "Huge Amounts"... What is huge to you, may not be large to someone else...

Please quantify "fast enough" ... i.e. Insert rates, retrieval rates, minimum response time, etc.
__________________
Lou
使大吃一惊
"Lisa, in this house, we obey the laws of thermodynamics!" - Homer Simpson
"I have my standards. They may be low, but I have them!" - Bette Middler
"It's a book about a Spanish guy named Manual. You should read it." - Dilbert

Reply With Quote
  #3 (permalink)  
Old 10-09-07, 14:33
amm amm is offline
Registered User
 
Join Date: Oct 2007
Posts: 5
Quote:
Originally Posted by loquin
Can you guarantee that the time stamp will be unique? or that a combination of timestamp/meta information will be unique?

all combinations of timestamp/meta information will be unique.

Quote:
Originally Posted by loquin
Please quantify "Huge Amounts"... What is huge to you, may not be large to someone else...

I'd estimate that there will be about one insert every two to three seconds.

Quote:
Originally Posted by loquin
Please quantify "fast enough" ... i.e. Insert rates, retrieval rates, minimum response time, etc.

Insert rates aren't that important (as long as they don't block selects on statistics), what really counts is the access to the last data-point (again depending on timestamp/meta information) and fast access to statistics.
Reply With Quote
  #4 (permalink)  
Old 10-09-07, 14:49
loquin loquin is offline
Super Moderator
 
Join Date: Jun 2004
Location: Arizona, USA
Posts: 1,359
Quote:
Originally Posted by amm
what really counts is the access to the last data-point (again depending on timestamp/meta information) and fast access to statistics.
define "fast"...
__________________
Lou
使大吃一惊
"Lisa, in this house, we obey the laws of thermodynamics!" - Homer Simpson
"I have my standards. They may be low, but I have them!" - Bette Middler
"It's a book about a Spanish guy named Manual. You should read it." - Dilbert

Reply With Quote
  #5 (permalink)  
Old 10-09-07, 14:55
loquin loquin is offline
Super Moderator
 
Join Date: Jun 2004
Location: Arizona, USA
Posts: 1,359
Note: The amount of data you describe is on the order of 15 million data points (and apx. 200 MB) per year - not 'huge' by any stretch. The index to support the unique/primary key you describe would almost double this. The database size to support this would not be excessive for any server-based product. Even the Express versions of SQL Server, Oracle, etc., would hold 5 years worth of data in a single database.

PostgreSQL or MySQL should work fine for this app. Assuming you don't need millisecond response times for reading the data.

I can't speak for MySQL, but PostgreSQL supports partitioned tables, which act to break up a single logical table structure into multiple, smaller physical tables, improving retrieval response times when single tables get too large (when they won't fit into available server RAM) in time critical applications.

How many clients will be retrieving data from the server at any one time?


Note - if you're really concerned about it, download the latest version of PostgreSQL (it's totally free,) install it, build a testbed database with the table/PK you describe, and populate it with a year or two of test data. Then, run tests to see how well it responds...

I tested postgresql last year with a similar, simple table structure, looking at insert speeds; I was pushing 1500 inserts a second into the table, and was retrieving data simultaneously (1-1000 records) with no apparent reduction in insertion rate.
__________________
Lou
使大吃一惊
"Lisa, in this house, we obey the laws of thermodynamics!" - Homer Simpson
"I have my standards. They may be low, but I have them!" - Bette Middler
"It's a book about a Spanish guy named Manual. You should read it." - Dilbert


Last edited by loquin : 10-09-07 at 16:02.
Reply With Quote
  #6 (permalink)  
Old 10-09-07, 15:00
amm amm is offline
Registered User
 
Join Date: Oct 2007
Posts: 5
Quote:
Originally Posted by loquin
define "fast"...

there's no definitive time-out limit, but the values should be delivered in about 0.1 to 0.5 seconds.
Reply With Quote
  #7 (permalink)  
Old 10-09-07, 15:12
amm amm is offline
Registered User
 
Join Date: Oct 2007
Posts: 5
Quote:
Originally Posted by loquin
Note: The amount of data you describe is on the order of 200 MB per year - not 'huge' by any stretch. The index to support the unique/primary key you describe would almost double this. The database size to support this would not be excessive for any server-based product. Even the Express versions of SQL Server, Oracle, etc., would hold 5 years worth of data in a single database.

PostgreSQL or MySQL should work fine for this app. Assuming you don't need millisecond response times for reading the data.

Thanks a lot for the suggestions. I will give PostgreSQL a try and if that doesn't work out, I'll check out MySQL.

Quote:
Originally Posted by loquin
I can't speak for MySQL, but PostgreSQL supports partitioned tables, which act to break up a single logical table structure into multiple, smaller physical tables, improving retrieval response times when single tables get too large (when they won't fit into available server RAM) in time critical applications.

How many clients will be retrieving data from the server at any one time?

The absolute worst case would be about 50 clients at the same time.
Reply With Quote
  #8 (permalink)  
Old 10-09-07, 15:24
blindman blindman is offline
World Class Flame Warrior
 
Join Date: Jun 2003
Location: Ohio
Posts: 9,291
Quote:
Originally Posted by loquin
What is huge to you, may not be large to someone else...
Always an embarrassing realization.
__________________
If it's not practically useful, then it's practically useless.

blindman
http://sqlblindman.googlepages.com/main
Reply With Quote
  #9 (permalink)  
Old 10-09-07, 16:04
loquin loquin is offline
Super Moderator
 
Join Date: Jun 2004
Location: Arizona, USA
Posts: 1,359
Quote:
Originally Posted by blindman
Always an embarrassing realization.
these personal situations sometime can be...
__________________
Lou
使大吃一惊
"Lisa, in this house, we obey the laws of thermodynamics!" - Homer Simpson
"I have my standards. They may be low, but I have them!" - Bette Middler
"It's a book about a Spanish guy named Manual. You should read it." - Dilbert

Reply With Quote
  #10 (permalink)  
Old 10-09-07, 18:35
loquin loquin is offline
Super Moderator
 
Join Date: Jun 2004
Location: Arizona, USA
Posts: 1,359
amm:

with the statistical values be for the total number of records, or for a subset of the records, and if a subset, will the subset usually be time-based? (records for the last 24 hours, for instance...)
__________________
Lou
使大吃一惊
"Lisa, in this house, we obey the laws of thermodynamics!" - Homer Simpson
"I have my standards. They may be low, but I have them!" - Bette Middler
"It's a book about a Spanish guy named Manual. You should read it." - Dilbert

Reply With Quote
  #11 (permalink)  
Old 10-11-07, 18:07
amm amm is offline
Registered User
 
Join Date: Oct 2007
Posts: 5
Quote:
Originally Posted by loquin
amm:

with the statistical values be for the total number of records, or for a subset of the records, and if a subset, will the subset usually be time-based? (records for the last 24 hours, for instance...)

statistics need to be calculated for:
  • all available data
  • data from the last day
  • the last week
  • the last month
  • the last year

at the moment we only intend to use some basic descriptive statistics like the arithmetic mean, modal values, minima, maxima, ...

there may be need to do regression analysis later on, but currently the main focus lies on descriptive statistics.
Reply With Quote
Reply


Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On