Results 1 to 8 of 8
  1. #1
    Join Date
    Dec 2004
    Location
    Madison, WI
    Posts
    3,926

    Unanswered: Storing the Summed value in the table

    I need to bring this topic up as I must say that I am surprised at those people who argue against storing the summed value into a table. Too many times I've read posts on this site from people saying "Never store the summed value". I know this may result in a debate or maybe this isn't the proper place to pose this but I guess you would need to answer this question:

    Why always add values together in the totals report or underlying query? Too many developers don't plan for a large number of records. I for one, don't want to be responsible for a 20 minute report which has to group and sum thousands or millions of records because I've added overhead by having to add the values of 2 or more fields or link in another table to the query. I can already think of your answer (why should I ever plan for that many records?) But the fact that there are developers spreading the concept of never storing the summed value REALLY REALLY scares me! To me it's a no-brainer that you would store the summed value in the main table and I can think of only 1 possible circumstance where you wouldn't store the summed value in a sub-table dealing with pricing. This would be line items but I would STILL store the summed value of the transaction in the main table.

    Take for example, a purchase-orders or transaction-type program. Usually you have a "purchase-order or transaction category" type field stored in the main table. I'm curious as to how many would store the summed value of all the line items in the main table and how many would not. Why would you have to add up all the line items to make a report grouping all purchase orders/transactions by category (or transaction date/etc..) when if you were to simply store the summed value in the main table, you would not have to worry about adding all the line items? Good code re-adds up the line items if changes are entered. I don't look at this as a normalization philosophy but more of a flat table design since when you total items from "pricing-type/costing/transaction" databases, you want to keep the joins of tables to a minimum (but maybe you don't have to deal with multi-users over a network in your databases.) I mean isn't it better to group and sum off of one table verses joining 3-4 tables together? I'm not saying you can't have a normalization structure but why not also store the fields your going to group off of in the main table? If you set up the relationships correctly, you won't have problems. If I design a "totals" report off of 3-4 tables verses 1 table, and someone opens that report (and keeps it open for a long time), which is more likely to give me problems if say I have a few thousand records I'm totalling off of? The report with 1 table in it or the report which has 3-4 tables linked in it. If your answer is neither or they are the same, I've got news for ya.

    If I get an acceptable answer, I'll take it. I am interested in the responses though.
    Last edited by pkstormy; 05-11-07 at 17:15.
    Expert Database Programming
    MSAccess since 1.0, SQL Server since 6.5, Visual Basic (5.0, 6.0)

  2. #2
    Join Date
    Mar 2003
    Location
    The Bottom of The Barrel
    Posts
    6,102
    Provided Answers: 1
    You're talking about exporting data to a disparate object for reporting purposes and storing aggregates there. We call that a datamart/warehouse, and nobody here frowns on it that I know of, however, it's important to note that a reporting structure is physically independent of your production transaction structure in this scenario.
    oh yeah... documentation... I have heard of that.

    *** What Do You Want In The MS Access Forum? ***

  3. #3
    Join Date
    Feb 2004
    Location
    Chicago, IL
    Posts
    1,312
    I agree with Teddy. There are certain situations that storing calculated data is beneficial. I am doing it myself to boost performance in a particular application. However, I thoroughly understand the risks. For me the risk of the totals not matching the underlying data is not life threatening, therefore I feel the boost in performance is worth any risk of the totals not representing the actual data.

    I think if you understand the risk and build in functionality that will help make sure the totals are always accurate, then it is acceptable. The majority of the questions that appear on this forum regarding storing a total are from novice developers. In that case I will always advise against storing a total. Simply because 80% of the time they are not sure how to do it any differently and their databases are not very large.

    Not many people are using Access with millions of records. I've seen too many posts where people believe that 10,000 records might be stressing Access' capabilities. You on the other hand have been using Access and databases long enough to understand the risk and rewards of this approach.

    Long story short, storing calculated data has its place in database development.

  4. #4
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    summary tables are an often used, comfortable, appropriate, efficient, cost beneficial, useful tool in any database developer's toolbox

    i thought everybody used them

    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  5. #5
    Join Date
    Feb 2004
    Location
    Wales
    Posts
    343
    My I add a comment from a user perspective?

    The storing of summed data does have its place that is clear. However, having come from a large retail background where EPOS is still developing, the original concept was frowned up on as no one saw the practicality of the process. Now, its different. The process whether store in the main system or derived as a sub system is used by department such as marketing and advertising to particulally in "Loyalty Card" enviroment, saving large amounts of calculation time. The answer, in my humble opinion is simple, if there is a rationale for the process - do it, if there isn't advise not to unless otherwise evidenced.

    Systems are large enough these days to cope with this storage methodology

    As stated - Just an opinion!
    Gareth

  6. #6
    Join Date
    Sep 2006
    Posts
    265
    I agree that it can sometimes be advatageous to store totals. if you have a print one purchase order then the gain is negligible. I think it is a balance between restricting the joins to the minimum, reports are often "one liners" and don't need the complete compliment of information you would provide on an entry screen and using your totalling regime.

    My instance where I use totalling is availability of Copies within a Print Edition. This is calculated and totalled into the Edition. Although, there aren't many records it just seems logical approach as you don't need to interogate down at the child level (Copy) to ascertain how many copies are available.

    The most important issue is performance and I would always suffice semantics for better performance.

    Simon

  7. #7
    Join Date
    Nov 2004
    Location
    out on a limb
    Posts
    13,692
    Provided Answers: 59
    like all rules there are good reasons for the rule to be developed, there are often also good reasons to ignore the rule.

    A 'ferisntance' if you take a fundamentalist perspective on stock control on never storing intermediate, derived data or totals then you have to recalcualate every product going back into the dark ages each time you want a current balance or stock query. its silly. if you have an opening stock for a previous period then apply any transactions since that opening stock the volume of work for the server engine (or pulled across the network for JET seeing as we are in the Access forum) drops dramatically, to go back through every transaction since year dot is just a waste of effort. its not needed. both should come up with the same answer, one just takes longer. if its a high volume item with lots of transactions then its going to take a heck of a lot longer.

    its a trade-off between what the theory says, and what the customer needs.
    I'd rather be riding on the Tiger 800 or the Norton

  8. #8
    Join Date
    Dec 2002
    Location
    Préverenges, Switzerland
    Posts
    3,740
    ahh Healdem, i partially agree.

    consider an accounting system.

    sooner or later FY2000 is going to get closed (as in audited and taxed and blah blah), and at that stage it seems obvious that all records prior to FY2001 are frozen solid. once history is frozen, it would be complete madness to dynamically calculate everything since FY1927 to generate a May2001 Balance Sheet.

    hence a close/open process seems logical (saving calculated data!! but at the same time locking the records on which the calculations are based so they cannot be edited without 'un-closing' the year) and current reports are based off prior open + postings since.

    where i disagree with you (maybe) is (in this accounting example) the idea of storing accumulating balances in each account AFTER the prior close. i have one JET file-servered backend that grows to tens of thousands of entries since prior open: it takes a few seconds (5?) for a detailed balance sheet using the prior open + dynamically calculated current approach.

    if i were to store accumulating balances for each account AND allow backdated (in open period) entries, i would either need to recalculate the 'saved' accumulators at each backdated entry or at each report.

    if i recalculated at each report, i'm worse off than not storing saved data since i not only have to SELECT but also to UPDATE.

    if i recalculated at data entry (in effect, all entries are 'backdated' by at least the snail-mail delay in receiving the documents), then i have the recalc delay on each entry and a wholly unresponsive system

    ...so i calculate dynamically with only 'frozen' close/open data saved as calculated values.

    as someone pointed out earlier - this is an Access site so i'm talking about dumb JET backends without clever trigger tricks to do the recalc in the background where the user can't feel the delay.

    izy
    currently using SS 2008R2

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •