Results 1 to 12 of 12
  1. #1
    Join Date
    Aug 2008
    Posts
    9

    Data versioning on key entities?

    I'm looking for some help on how to design a schema that covers data versioning on a database entity.

    For example: there is a requirement for a database to store details of a company's products for all staff to view. The 'product' entity comprises of a main product table and multiple subordinate data tables that hold various properties and sub properties of a product's profile. In otherwords there are multiple linked tables that in total hold all of the data associated with a product.

    This product information has to move through a authoring, submit, review and approve cycle. Eventualy when the product info is approved it is available for all users to see.

    The product information authors can subsequently post updates/revisions to the original product data. Yet whilst the updates are being submitted and approved the original product data has to be available for all users to view. Hence, the database schema must be able to simultaneously store the original product info as well as the updated info which is going thru the authoring, submit and approve cycle. Eventually when the revisions are approved the original data is replaced with the revised data, so that all users now only see the revised data.

    In thinking of hiow to design a scheme to address this requirement I am split between using...
    * Separate tables to hold the public approved data and the yet to be approved revision data; or
    * Having all the data in a single table with a sub key to identify public versus private

    Thoughts?

  2. #2
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    use the same tables with a status code
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  3. #3
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    If you do that you are going to want a materialized view of each table with the STATUS filter already applied.
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

  4. #4
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    do they have to be materialized???
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  5. #5
    Join Date
    Feb 2008
    Posts
    43
    I would use two different tables for the already authorised data, and for the unauthorised data.

    I tried to think of some serious reasons, but I couldn't. I belive it's experience.

    For example, in the scneario with a Status field, I assume you won't be able to have a real primary key(different then a surogate key), because both your authorised and non-authorised data will have the same id.

    For me it is a much more clear separation in two tables, then one Status Code.
    If performance is very important, I'll recomend having maybe one table for the authorised data, one for the unauthorised data, and if you want to keep the history of the changes, another table for the history.

    But as I said, I can't give you strong reasons for my choice.
    I have applied this with a DBMS that is not a major one, therefore performance of the DBMS was not very high.

  6. #6
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Having separate tables for values of "approved" doesn't make any more sense than having separate tables for the values of "product name" to me. The approved status is an attribute, just like name. I'd only have one table.

    -PatP

  7. #7
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Quote Originally Posted by r937
    do they have to be materialized???
    Yes. He will need the views to be indexed or he won't get any benefit from them.
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

  8. #8
    Join Date
    Feb 2008
    Posts
    43
    Quote Originally Posted by Pat Phelan
    Having separate tables for values of "approved" doesn't make any more sense than having separate tables for the values of "product name" to me. The approved status is an attribute, just like name. I'd only have one table.
    -PatP
    If you have them all together in the same table, you will have to do all the time, two different selects to get the one's with the flag set, and the ones without the flag set. You will have to pay attention, in the client application to allways remove the not aproved ones, and all the time a new functionality will be implemented, a new report...you shouldn't forget to select based on the STATUS flag, which gives problems later on for maintainance, for other developers that will come, not knowing what that Status flag means.

    If you have 20 mil records aproved (and 10.000 not aproved) all the time you'll want to select either the aproved one, or not, you'll have to go trough all the records. I don't see it liekely to have the case when you'll select both aproved and not aproved records in the same SELECT.

    If you have another table for anuthorized(unapproved) you could add there, the status of the unauthorised record, it can vary(some user just wanted to pause his work, another record is on the second level of authorisation, another one, just couldn't be processed from a automatic process.

    What I mean, is that in the unauthorised table you can have much more detailed status, and you can hold the information strictly related to the authorisation, which in the one table scenario will all have to be added to the main (approved and unapproved) table.
    And with two tables you do not lose performance, from my point of view, you gain performance.

    Basically , I do not like Status fields, they tend to be overused, you will want to add another type of Status Code, and that just adds to the complexity and makes it unclear, and not to simplicity. The Authorisation Flag is not the same as a flag "In stock: Yes/No"
    Last edited by ronnyy; 09-02-08 at 16:30.

  9. #9
    Join Date
    Sep 2002
    Location
    UK
    Posts
    5,171
    Quote Originally Posted by blindman
    Yes. He will need the views to be indexed or he won't get any benefit from them.
    Not necessarily, surely? Regular views still benefit from the indexes on the underlying tables.

  10. #10
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Most of his OLTP queries will be done against "Approved" records. To avoid scanning for these records repeatedly (being Boolean they will not index very well) he needs to create an indexed view displaying only those records.
    This has nothing to do with the indexes on the underlying table.

    davoutuk, what RDBMS are you planning on using?
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

  11. #11
    Join Date
    Sep 2002
    Location
    UK
    Posts
    5,171
    I would question "needs to", really. If the performance is fine without the materialized views, why create them? I would only consider them if performance was found to be an issue.

  12. #12
    Join Date
    Aug 2008
    Posts
    9
    Thanks for all the replies... Here's some more info on the problem...

    I'm building a system to capture staff skills/expertise.

    The basic entity hierarchy is something like...
    • A staff member is represented by an 'expert'
    • An expert can have one or more areas of 'expertise'
    • An expertise can have one or more episodes that record the expert's experience

    There will be an approval process where a staff member's data changes/additions to any one of these levels has to be approved by their manager.

    Hence, the database must allow for...
    • a version of approved data that is visible for public view, and
    • a possible 'WIP' version of the same entity that is not available to the public and awaiting approval from the manager.
    • This approved/WIP duplication can exist at all three expert/expertise/episode levels.


    Thoughts?

    My underlying DB is MySQL 5.x The design is aimed to address organizations with a headcount of up to 10,000.
    Last edited by davoutuk; 07-08-09 at 05:25.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •