  1. #1
    Join Date
    Mar 2008
    Posts
    5

    Unanswered: 1 million records

    Hello,
    What would be the best way to AVG() 1 million records?
    I am worried about performance.
    Should I just write the SQL in my application code, or should I write a stored procedure?

    I need to average out data to build charts.


    I am a little new to postgreSQL so any help would be appreciated.

    drew

  2. #2
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    how many times do you want to do it?
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  3. #3
    Join Date
    Mar 2008
    Posts
    5
    I am using PHP on the web, so it's as many times as people access the page.

  4. #4
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    how much volatility in the million rows? are they ever updated at all? because if not, then you would calculate the average and store it somewhere

    are most of the rows going to be static? are the rows added by date, and a new average needed after every day?

    see, you've given so little information, what are we supposed to tell you?

  5. #5
    Join Date
    Mar 2008
    Posts
    5
    Sorry I gave so little information.
    Here is how the system works.

    Data is added to the DB every second of the day, storing electricity consumption readings (e.g. watts, volts).

    I would like to build charts of things like average usage per year, month, day, etc.

    Every day 86,400 records are added; it's not a million, but over time it can get huge.
    ... the date and time are stored in each row.

    I was thinking of maybe writing the average into a different table every day at 23:59:59 and using that, but if there is a better idea I would appreciate it.

    Drew
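    Drew's daily-rollup idea could be sketched roughly like this (the summary table and the watts/volts columns are just illustrative names, not from the thread; the source table and date column follow the example later in the thread):
    Code:
    CREATE TABLE daily_averages (
        avg_date   date PRIMARY KEY,
        avg_watts  numeric,
        avg_volts  numeric
    );

    -- run once per day, e.g. from cron just before midnight
    INSERT INTO daily_averages (avg_date, avg_watts, avg_volts)
    SELECT dateadded::date, AVG(watts), AVG(volts)
      FROM daTable
     WHERE dateadded >= CURRENT_DATE
       AND dateadded  < CURRENT_DATE + 1
     GROUP BY dateadded::date;
    The charts would then read from daily_averages (86,400 rows collapsed into 1 per day) instead of scanning the raw table.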

  6. #6
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    i would begin by writing simple SQL queries

    don't attempt to design optimization into the data structure until you have a performance problem

    make sure you have a datetime column, and in order to extract stats for current month, etc., use a range test such as this --
    Code:
    SELECT AVG(something) AS avg_something
      FROM daTable
     WHERE dateadded >= '2008-02-01'
       AND dateadded  < '2008-03-01'
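    The same range-test approach extends naturally to grouped stats; a sketch (same assumed names as above) that averages per month in a single pass using PostgreSQL's date_trunc():
    Code:
    SELECT date_trunc('month', dateadded) AS month,
           AVG(something) AS avg_something
      FROM daTable
     GROUP BY date_trunc('month', dateadded)
     ORDER BY month;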

  7. #7
    Join Date
    Nov 2003
    Posts
    2,935
    Provided Answers: 12
    Quote Originally Posted by r937
    make sure you have a datetime column, and in order to extract stats for current month
    And make sure you have an index on that datetime column
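    A sketch of that index, using the table and column names from the earlier example:
    Code:
    CREATE INDEX idx_datable_dateadded ON daTable (dateadded);
    With this in place, the range test (dateadded >= ... AND dateadded < ...) can use an index scan instead of reading the whole table.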

  8. #8
    Join Date
    Apr 2008
    Posts
    29
    If all you have to do is calculate the average daily, weekly, and so on... you might want to store those values in another table and devise some stored procedures to do the calculation once...
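    A minimal PL/pgSQL sketch of such a procedure (the summary table and column names are assumptions, not from the thread):
    Code:
    CREATE OR REPLACE FUNCTION store_daily_average() RETURNS void AS $$
    BEGIN
        -- assumes a summary table daily_averages(avg_date, avg_watts) exists
        INSERT INTO daily_averages (avg_date, avg_watts)
        SELECT dateadded::date, AVG(watts)
          FROM daTable
         WHERE dateadded >= CURRENT_DATE
           AND dateadded  < CURRENT_DATE + 1
         GROUP BY dateadded::date;
    END;
    $$ LANGUAGE plpgsql;
    The function would be called once a day (e.g. via cron and psql), so the expensive aggregation runs once rather than on every page view.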

  9. #9
    Join Date
    May 2008
    Posts
    5
    In fact, if all you are doing is averaging the data, you might consider storing it in groups; i.e. saving the average per minute would cut your storage requirement by a factor of ~60.
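    That per-minute rollup could look something like this (the minute_averages table and watts column are illustrative names, not from the thread):
    Code:
    INSERT INTO minute_averages (minute_start, avg_watts)
    SELECT date_trunc('minute', dateadded), AVG(watts)
      FROM daTable
     GROUP BY date_trunc('minute', dateadded);
    Once a minute's average is stored, the 60 raw per-second rows for that minute could be deleted, trading fine-grained detail for a ~60x smaller table.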
