If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > General > Database Concepts & Design > Data Record Versioning - How to implement?

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 04-27-09, 17:57
fifieldn fifieldn is offline
Registered User
 
Join Date: Mar 2009
Posts: 10
Smile Data Record Versioning - How to implement?

Hi
I am using SQL Server Express 2008.
I want to implement a DB independent mechanism for versioning data. So for example, if a record is updated I want the original record stored so i am still able to maintain a reference to this orginal data.
Can you recomment a design/mechanism?

I have read this article CodeProject: Record Versioning with SQL Server. Free source code and programming help

However, I know there is a way to maintain the versioned data and original data in one table. Does anyone have information on this approach?

Any others?
I would appreciate some prompt help as I need to get started on an implemention asap.

Thanks
Reply With Quote
  #2 (permalink)  
Old 04-27-09, 19:15
mike_bike_kite mike_bike_kite is offline
vaguely human
 
Join Date: Jun 2007
Location: London
Posts: 2,519
Quote:
I would appreciate some prompt help as I need to get started on an implemention asap
The simplest thing is just to add a last modified time to your main table and then, when ever you update the record, you just insert a new record into the table but with a new modified time. All selects of data from this table will need to make sure they use the record with the latest modified time.

You could also add a live record flag but that would mean having to update the old record to clear the old live flag. You might also want a deleted flag to indicate that the record is now deleted. Another addition might be a field to indicate who updated the record.

Doing things this way does use up space as you have to store the complete record again even if only one field gets altered.

Having a history table and a live data table is often done but is rarely a good solution.

Hope that helps.

Mike
Reply With Quote
  #3 (permalink)  
Old 04-28-09, 12:08
blindman blindman is offline
World Class Flame Warrior
 
Join Date: Jun 2003
Location: Ohio
Posts: 11,726
Don't store the current data and historical data all in your main production table. That is just asking for performance problems against 95% of your queries.
Create separate archive tables for each of the datasets you want to track.
__________________
If it's not practically useful, then it's practically useless.

blindman
www.chess.com: "sqlblindman"
Reply With Quote
  #4 (permalink)  
Old 04-28-09, 14:21
mike_bike_kite mike_bike_kite is offline
vaguely human
 
Join Date: Jun 2007
Location: London
Posts: 2,519
Quote:
Originally Posted by blindman
Don't store the current data and historical data all in your main production table. That is just asking for performance problems against 95% of your queries.
Create separate archive tables for each of the datasets you want to track.
Assuming you have reasonable indexes then it should take about the same time to access the data using either method. I'll accept that there'll be a couple more index pages referenced if you're putting all the data into one table but these pages will normally be cached so I don't think the user will notice any difference.

The developer however will notice a huge difference if they have to write code that goes to two tables rather than one. If the table is central to your system then you could easily end up doubling the amount of SQL being written.

Imagine reporting on these tables - you'll need all the same reporting options to apply to two tables. Then you'll add differences like time period options on the history table which won't apply to the current table. Soon you'll find all sorts of other differences creeping in leading to user confusion. Users will also be wondering if MTD reporting includes data from the current table or is it just history data. Having 2 sets of code producing the same reports will lead to more bugs - users will also start asking why there's a difference when reporting on current data and when reporting on the same data a day later.

There's also all the code that transfers the data between the two tables that now has to be written. Then the code that decides which table we're updating or selecting from. This code isn't needed with a single table.

These are all generic points as we don't know how much data we're dealing with, how long the old data is being kept for or how many updates happen on a daily basis. The choice in the end is do you want to create 2 tables and write twice as much code or use 1 table (normalising your code?) and keep roughly the same performance with less code.

Mike
Reply With Quote
  #5 (permalink)  
Old 04-29-09, 08:30
andrewst andrewst is offline
Moderator.
 
Join Date: Sep 2002
Location: UK
Posts: 5,171
Don't do it the PeopleSoft way, whatever you do! Or at least how PeopleSoft worked many years ago when I was exposed to it. Each table had an effective_from_date column (can't remember the actual name) but no effective_to_date column, so to get the record for any particular date you had to write something like:
Code:
select ...
from the_table t1
where ...
and effective_from_date =
( select max(effective_from_date)
  from the_table t2
  where t2.key_col = t1.key_col
  and t2.effective_from_date <= :date_variable
);
For a join of data from many tables, you had to repeat this subquery for each one. Not the most efficient of solutions!
__________________
Tony Andrews
http://tinyurl.com/tonyandrews
Reply With Quote
  #6 (permalink)  
Old 04-29-09, 11:48
mike_bike_kite mike_bike_kite is offline
vaguely human
 
Join Date: Jun 2007
Location: London
Posts: 2,519
There was a similar post on versioning here. At that time I thought it was best to use two dates (as you're suggesting now) but everyone seemed to prefer using a single date field - I won't repeat all the arguments. I guess it's just personal preference in the end.

Last edited by mike_bike_kite; 04-29-09 at 11:52.
Reply With Quote
  #7 (permalink)  
Old 04-29-09, 12:20
andrewst andrewst is offline
Moderator.
 
Join Date: Sep 2002
Location: UK
Posts: 5,171
Ah yes, I remember that one. I somehow managed to restrain myself from contributing, but if I had I would have firmly backed you up on the 2 dates solution (though not of course on the EAV references!) From a quick re-read it seems that no one actually objected to the 2 dates, but then again no one apart from you actually advocated them either? Strange.
__________________
Tony Andrews
http://tinyurl.com/tonyandrews
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On