Results 1 to 5 of 5
  1. #1
    Join Date
    Mar 2010

    Unanswered: DB structure for massive data inserts (2,5 to 3,2 mil rows per day)


    I'm taking over one solution witch is not working very well and it haven't maintained systematically.
    The plan is to create new table structure/logic and I would appreciate Yours help here.

    Input for planning:
    data from network, what is need to insert into db, comes like:
    "date", "parameter name", "parameter value"
    same parameter name, but might be different value, comes approx. after every 3-5 sec. (197 different parameter names)

    there's existing application, what inserts this data as one row into db.
    that kind of rows comes approx. 2,5 to 3,2 mil. per each day.
    Indexes are built on column date and parameter name.

    Business wants to get every kind of online and calculated data based on this raw data, so we need to insert all sent data.
    And if there are more that one week data in one table, queries are getting very slow. (online queries, what should run like once after 30 sec)
    also calculations are running from cron, to calculate one hour average, one day average, etc data.

    My idea:
    create one table, with same indexes, but hold there only last day + current day data.
    copy every night now-2 day data to another table. then i can keep so called online table max 6 mil rows and then online queries should be very ok.
    second table is for week analyze, also with same indexes, and every night i will copy older than now-7 day data to month table.
    month table would be for room saving purpose without indexes and it's actually only archiving.

    1. how my proposed solution feels?
    2. how would be best way to organize nightly coping process?

    with thanks.

  2. #2
    Join Date
    Dec 2007
    London, UK
    Are the parameter values all sampled at the same frequency and timestamped with the same time? If so then you could presumably have one row for all 197 attributes - which would mean fewer than 30,000 rows per day.

  3. #3
    Join Date
    Mar 2010
    no, data comes from sensors and sequence or even every batch can consist different order and amount of parameters.
    it's like one pipe shooting messages (consist date, and value).
    it have to save same way - 3 columns in one row.

    of course, question is also, is it point to make one more column - rowid or smt similar ?
    is it makes coping easier ?

  4. #4
    Join Date
    Nov 2005
    You are not looking for a RDBMS system, but a times series database. This is a database system that is specially designed to store time series, and only time series, data. You could take a look at PI System, Historis or iHistorian.

  5. #5
    Join Date
    Jul 2003
    San Antonio, TX
    This is very similar to what I am collecting for several customers, - perfmon counters. The way I resolved it is by storing the data into data warehouse structure (star schema), where objecttype, objectname, countername, and date value are stored into dimention tables. Then the corresponding ID's for above dimensions are stored along with counter values into a fact table. Of course, along with minimizing the size of each row by introducing the corresponding ID's for dimensions, you also need to consider data partitioning. And in order to introduce an efficient partitioning strategy based on date values, for example, you may want to bring in an additional field from the time dimension. This may be a month, in which case you may want to partition by month with 12 resulting partitions. Or by day of month, with 31 partitions. Or you may even consider week or even a day number (up to 366 partitions). But if you want to introduce a sliding window, you may have to keep the date field within your fact table, and then have a nightly merge/split/switch operation on the production fact table. Along with that, you may want to have an "aligned" archive table with fewer partitions and 1 overlapping with production partition, so that you can really automate "data aging" process.
    "The data in a record depends on the Key to the record, the Whole Key, and
    nothing but the Key, so help me Codd."

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts