05-11-07, 15:58 #1Moderator
- Join Date
- Dec 2004
- Madison, WI
Unanswered: Storing the Summed value in the table
I need to bring this topic up as I must say that I am surprised at those people who argue against storing the summed value into a table. Too many times I've read posts on this site from people saying "Never store the summed value". I know this may result in a debate or maybe this isn't the proper place to pose this but I guess you would need to answer this question:
Why always add values together in the totals report or underlying query? Too many developers don't plan for a large number of records. I for one, don't want to be responsible for a 20 minute report which has to group and sum thousands or millions of records because I've added overhead by having to add the values of 2 or more fields or link in another table to the query. I can already think of your answer (why should I ever plan for that many records?) But the fact that there are developers spreading the concept of never storing the summed value REALLY REALLY scares me! To me it's a no-brainer that you would store the summed value in the main table and I can think of only 1 possible circumstance where you wouldn't store the summed value in a sub-table dealing with pricing. This would be line items but I would STILL store the summed value of the transaction in the main table.
Take for example, a purchase-orders or transaction-type program. Usually you have a "purchase-order or transaction category" type field stored in the main table. I'm curious as to how many would store the summed value of all the line items in the main table and how many would not. Why would you have to add up all the line items to make a report grouping all purchase orders/transactions by category (or transaction date/etc..) when if you were to simply store the summed value in the main table, you would not have to worry about adding all the line items? Good code re-adds up the line items if changes are entered. I don't look at this as a normalization philosophy but more of a flat table design since when you total items from "pricing-type/costing/transaction" databases, you want to keep the joins of tables to a minimum (but maybe you don't have to deal with multi-users over a network in your databases.) I mean isn't it better to group and sum off of one table verses joining 3-4 tables together? I'm not saying you can't have a normalization structure but why not also store the fields your going to group off of in the main table? If you set up the relationships correctly, you won't have problems. If I design a "totals" report off of 3-4 tables verses 1 table, and someone opens that report (and keeps it open for a long time), which is more likely to give me problems if say I have a few thousand records I'm totalling off of? The report with 1 table in it or the report which has 3-4 tables linked in it. If your answer is neither or they are the same, I've got news for ya.
If I get an acceptable answer, I'll take it. I am interested in the responses though.
Last edited by pkstormy; 05-11-07 at 16:15.Expert Database Programming
MSAccess since 1.0, SQL Server since 6.5, Visual Basic (5.0, 6.0)
05-11-07, 16:59 #2Purveyor of Discontent
Provided Answers: 1
- Join Date
- Mar 2003
- The Bottom of The Barrel
You're talking about exporting data to a disparate object for reporting purposes and storing aggregates there. We call that a datamart/warehouse, and nobody here frowns on it that I know of, however, it's important to note that a reporting structure is physically independent of your production transaction structure in this scenario.oh yeah... documentation... I have heard of that.
*** What Do You Want In The MS Access Forum? ***
05-11-07, 18:05 #3Registered User
- Join Date
- Feb 2004
- Chicago, IL
I agree with Teddy. There are certain situations that storing calculated data is beneficial. I am doing it myself to boost performance in a particular application. However, I thoroughly understand the risks. For me the risk of the totals not matching the underlying data is not life threatening, therefore I feel the boost in performance is worth any risk of the totals not representing the actual data.
I think if you understand the risk and build in functionality that will help make sure the totals are always accurate, then it is acceptable. The majority of the questions that appear on this forum regarding storing a total are from novice developers. In that case I will always advise against storing a total. Simply because 80% of the time they are not sure how to do it any differently and their databases are not very large.
Not many people are using Access with millions of records. I've seen too many posts where people believe that 10,000 records might be stressing Access' capabilities. You on the other hand have been using Access and databases long enough to understand the risk and rewards of this approach.
Long story short, storing calculated data has its place in database development.
05-11-07, 20:44 #4SQL Consultant
- Join Date
- Apr 2002
- Toronto, Canada
05-12-07, 08:15 #5Registered User
- Join Date
- Feb 2004
My I add a comment from a user perspective?
The storing of summed data does have its place that is clear. However, having come from a large retail background where EPOS is still developing, the original concept was frowned up on as no one saw the practicality of the process. Now, its different. The process whether store in the main system or derived as a sub system is used by department such as marketing and advertising to particulally in "Loyalty Card" enviroment, saving large amounts of calculation time. The answer, in my humble opinion is simple, if there is a rationale for the process - do it, if there isn't advise not to unless otherwise evidenced.
Systems are large enough these days to cope with this storage methodology
As stated - Just an opinion!
05-14-07, 08:25 #6Registered User
- Join Date
- Sep 2006
I agree that it can sometimes be advatageous to store totals. if you have a print one purchase order then the gain is negligible. I think it is a balance between restricting the joins to the minimum, reports are often "one liners" and don't need the complete compliment of information you would provide on an entry screen and using your totalling regime.
My instance where I use totalling is availability of Copies within a Print Edition. This is calculated and totalled into the Edition. Although, there aren't many records it just seems logical approach as you don't need to interogate down at the child level (Copy) to ascertain how many copies are available.
The most important issue is performance and I would always suffice semantics for better performance.
05-14-07, 08:35 #7Jaded Developer
Provided Answers: 59
- Join Date
- Nov 2004
- out on a limb
like all rules there are good reasons for the rule to be developed, there are often also good reasons to ignore the rule.
A 'ferisntance' if you take a fundamentalist perspective on stock control on never storing intermediate, derived data or totals then you have to recalcualate every product going back into the dark ages each time you want a current balance or stock query. its silly. if you have an opening stock for a previous period then apply any transactions since that opening stock the volume of work for the server engine (or pulled across the network for JET seeing as we are in the Access forum) drops dramatically, to go back through every transaction since year dot is just a waste of effort. its not needed. both should come up with the same answer, one just takes longer. if its a high volume item with lots of transactions then its going to take a heck of a lot longer.
its a trade-off between what the theory says, and what the customer needs.I'd rather be riding on the Tiger 800 or the Norton
05-14-07, 14:43 #8Cavalier King Charles
- Join Date
- Dec 2002
- Préverenges, Switzerland
ahh Healdem, i partially agree.
consider an accounting system.
sooner or later FY2000 is going to get closed (as in audited and taxed and blah blah), and at that stage it seems obvious that all records prior to FY2001 are frozen solid. once history is frozen, it would be complete madness to dynamically calculate everything since FY1927 to generate a May2001 Balance Sheet.
hence a close/open process seems logical (saving calculated data!! but at the same time locking the records on which the calculations are based so they cannot be edited without 'un-closing' the year) and current reports are based off prior open + postings since.
where i disagree with you (maybe) is (in this accounting example) the idea of storing accumulating balances in each account AFTER the prior close. i have one JET file-servered backend that grows to tens of thousands of entries since prior open: it takes a few seconds (5?) for a detailed balance sheet using the prior open + dynamically calculated current approach.
if i were to store accumulating balances for each account AND allow backdated (in open period) entries, i would either need to recalculate the 'saved' accumulators at each backdated entry or at each report.
if i recalculated at each report, i'm worse off than not storing saved data since i not only have to SELECT but also to UPDATE.
if i recalculated at data entry (in effect, all entries are 'backdated' by at least the snail-mail delay in receiving the documents), then i have the recalc delay on each entry and a wholly unresponsive system
...so i calculate dynamically with only 'frozen' close/open data saved as calculated values.
as someone pointed out earlier - this is an Access site so i'm talking about dumb JET backends without clever trigger tricks to do the recalc in the background where the user can't feel the delay.
izycurrently using SS 2008R2