The table will have approximately 700 different policies, each with approximately 200 records, so approximately 140,000 records in total.
Each month I am going to receive an Excel workbook containing revised information for some of the policies. I am going to need to upload the revised information into the tblPolicyStreams table, overwriting the existing information for those policies. I need to keep a backup of the data before I overwrite it and I will need to be able to run a query so that I can see all of the historical information we have ever had for each policy and be able to compare them side by side to see what has been changed.
So, my choices?
It would seem to be inefficient to take a backup copy of the entire tblPolicyStreams table each time, because only a handful of policies will be changed.
However, if I run a query to create a new table containing the current data for each policy, I am going to end up having a lot of 'historical' tables in the database.
So at the moment I am considering having one 'historic table' - say it's called tblOldPolicyStreams - where I append historical data when a policy is revised. The tblOldPolicyStreams would have an additional field called something like 'BACKUPDATE' so I can tell when the data was changed.
Is that a good way to go about this or am I missing a trick? Anything I should be aware of before diving in?
I have always found it useful to keep historical data. For instance, if you want to print out a copy of an old order for a customer, you will want the details of the salesman who took the order shown. This can be difficult if the salesman has left and someone else is looking after the customer.
What I do is this (adapted for your situation):
All of my records use Autonumber keys, so they last forever.
I would have two tables. The parent table would have unchanging policy data (usually customer-related stuff, but depending on how volatile your customer data is, it might be no more than the policy number and the date of first issue.
All of the other data goes into a child table, with multiple records for each policy. Each of the records carries a creation date and a checkbox which indicates which record is current.
The editing form shows the header record details and the current child record details. It also has a listbox with two values: CURRENT and HISTORICAL. The former is the default and ensures that the current data is displayed. If you select the latter, a list box, normally hidden becomes visible and shows the date of issue of the various amendments. Selecting one of these, causes the selected version to be displayed.
Very importantly, there is a cautionary message warning that the record on display is not current.
There is another listbox with two values: NEW and UPDATE. This determines whether a new current record is created(taking over the checkbox tick), or you are simply correcting an error. Note that only the current record can be updated. Once a record has lost the current checkbox tick, it cannot be modified thereafter. After all we don't want people rewriting history, do we !
I hope this stuff gives you a few ideas.
This should give you some ideas: Slowly changing dimension - Wikipedia, the free encyclopedia
I use type 4 - simple, straightforward, normalised, standard (follow up the link to change data capture)
I would not use your first two options.
I would also include a column in the history table indicating what happened to the row (i.e. DELETED, INSERTED, UPDATED).
This sort of logging is most easily done by RDBMSs with either triggers or inbuilt change capture functionality; unfortunately Access\JET\ACE have neither so you have to roll your own