Results 1 to 7 of 7
  1. #1
    Join Date
    Mar 2009
    Posts
    10

    Smile Data Record Versioning - How to implement?

    Hi
    I am using SQL Server Express 2008.
    I want to implement a DB independent mechanism for versioning data. So for example, if a record is updated I want the original record stored so i am still able to maintain a reference to this orginal data.
    Can you recomment a design/mechanism?

    I have read this article CodeProject: Record Versioning with SQL Server. Free source code and programming help

    However, I know there is a way to maintain the versioned data and original data in one table. Does anyone have information on this approach?

    Any others?
    I would appreciate some prompt help as I need to get started on an implemention asap.

    Thanks

  2. #2
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    I would appreciate some prompt help as I need to get started on an implemention asap
    The simplest thing is just to add a last modified time to your main table and then, when ever you update the record, you just insert a new record into the table but with a new modified time. All selects of data from this table will need to make sure they use the record with the latest modified time.

    You could also add a live record flag but that would mean having to update the old record to clear the old live flag. You might also want a deleted flag to indicate that the record is now deleted. Another addition might be a field to indicate who updated the record.

    Doing things this way does use up space as you have to store the complete record again even if only one field gets altered.

    Having a history table and a live data table is often done but is rarely a good solution.

    Hope that helps.

    Mike

  3. #3
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Don't store the current data and historical data all in your main production table. That is just asking for performance problems against 95% of your queries.
    Create separate archive tables for each of the datasets you want to track.
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

  4. #4
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    Quote Originally Posted by blindman
    Don't store the current data and historical data all in your main production table. That is just asking for performance problems against 95% of your queries.
    Create separate archive tables for each of the datasets you want to track.
    Assuming you have reasonable indexes then it should take about the same time to access the data using either method. I'll accept that there'll be a couple more index pages referenced if you're putting all the data into one table but these pages will normally be cached so I don't think the user will notice any difference.

    The developer however will notice a huge difference if they have to write code that goes to two tables rather than one. If the table is central to your system then you could easily end up doubling the amount of SQL being written.

    Imagine reporting on these tables - you'll need all the same reporting options to apply to two tables. Then you'll add differences like time period options on the history table which won't apply to the current table. Soon you'll find all sorts of other differences creeping in leading to user confusion. Users will also be wondering if MTD reporting includes data from the current table or is it just history data. Having 2 sets of code producing the same reports will lead to more bugs - users will also start asking why there's a difference when reporting on current data and when reporting on the same data a day later.

    There's also all the code that transfers the data between the two tables that now has to be written. Then the code that decides which table we're updating or selecting from. This code isn't needed with a single table.

    These are all generic points as we don't know how much data we're dealing with, how long the old data is being kept for or how many updates happen on a daily basis. The choice in the end is do you want to create 2 tables and write twice as much code or use 1 table (normalising your code?) and keep roughly the same performance with less code.

    Mike

  5. #5
    Join Date
    Sep 2002
    Location
    UK
    Posts
    5,171
    Don't do it the PeopleSoft way, whatever you do! Or at least how PeopleSoft worked many years ago when I was exposed to it. Each table had an effective_from_date column (can't remember the actual name) but no effective_to_date column, so to get the record for any particular date you had to write something like:
    Code:
    select ...
    from the_table t1
    where ...
    and effective_from_date =
    ( select max(effective_from_date)
      from the_table t2
      where t2.key_col = t1.key_col
      and t2.effective_from_date <= :date_variable
    );
    For a join of data from many tables, you had to repeat this subquery for each one. Not the most efficient of solutions!

  6. #6
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    There was a similar post on versioning here. At that time I thought it was best to use two dates (as you're suggesting now) but everyone seemed to prefer using a single date field - I won't repeat all the arguments. I guess it's just personal preference in the end.
    Last edited by mike_bike_kite; 04-29-09 at 12:52.

  7. #7
    Join Date
    Sep 2002
    Location
    UK
    Posts
    5,171
    Ah yes, I remember that one. I somehow managed to restrain myself from contributing, but if I had I would have firmly backed you up on the 2 dates solution (though not of course on the EAV references!) From a quick re-read it seems that no one actually objected to the 2 dates, but then again no one apart from you actually advocated them either? Strange.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •