If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > General > Database Concepts & Design > Data versioning on key entities?

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 08-31-08, 04:43
davoutuk davoutuk is offline
Registered User
 
Join Date: Aug 2008
Posts: 9
Data versioning on key entities?

I'm looking for some help on how to design a schema that covers data versioning on a database entity.

For example: there is a requirement for a database to store details of a company's products for all staff to view. The 'product' entity comprises of a main product table and multiple subordinate data tables that hold various properties and sub properties of a product's profile. In otherwords there are multiple linked tables that in total hold all of the data associated with a product.

This product information has to move through a authoring, submit, review and approve cycle. Eventualy when the product info is approved it is available for all users to see.

The product information authors can subsequently post updates/revisions to the original product data. Yet whilst the updates are being submitted and approved the original product data has to be available for all users to view. Hence, the database schema must be able to simultaneously store the original product info as well as the updated info which is going thru the authoring, submit and approve cycle. Eventually when the revisions are approved the original data is replaced with the revised data, so that all users now only see the revised data.

In thinking of hiow to design a scheme to address this requirement I am split between using...
* Separate tables to hold the public approved data and the yet to be approved revision data; or
* Having all the data in a single table with a sub key to identify public versus private

Thoughts?
Reply With Quote
  #2 (permalink)  
Old 08-31-08, 05:53
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
use the same tables with a status code
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #3 (permalink)  
Old 09-01-08, 10:36
blindman blindman is offline
World Class Flame Warrior
 
Join Date: Jun 2003
Location: Ohio
Posts: 11,726
If you do that you are going to want a materialized view of each table with the STATUS filter already applied.
__________________
If it's not practically useful, then it's practically useless.

blindman
www.chess.com: "sqlblindman"
Reply With Quote
  #4 (permalink)  
Old 09-01-08, 10:41
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
do they have to be materialized???
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #5 (permalink)  
Old 09-01-08, 16:58
ronnyy ronnyy is offline
Registered User
 
Join Date: Feb 2008
Posts: 43
I would use two different tables for the already authorised data, and for the unauthorised data.

I tried to think of some serious reasons, but I couldn't. I belive it's experience.

For example, in the scneario with a Status field, I assume you won't be able to have a real primary key(different then a surogate key), because both your authorised and non-authorised data will have the same id.

For me it is a much more clear separation in two tables, then one Status Code.
If performance is very important, I'll recomend having maybe one table for the authorised data, one for the unauthorised data, and if you want to keep the history of the changes, another table for the history.

But as I said, I can't give you strong reasons for my choice.
I have applied this with a DBMS that is not a major one, therefore performance of the DBMS was not very high.
Reply With Quote
  #6 (permalink)  
Old 09-01-08, 20:58
Pat Phelan Pat Phelan is offline
Resident Curmudgeon
 
Join Date: Feb 2004
Location: In front of the computer
Posts: 12,605
Having separate tables for values of "approved" doesn't make any more sense than having separate tables for the values of "product name" to me. The approved status is an attribute, just like name. I'd only have one table.

-PatP
Reply With Quote
  #7 (permalink)  
Old 09-01-08, 23:32
blindman blindman is offline
World Class Flame Warrior
 
Join Date: Jun 2003
Location: Ohio
Posts: 11,726
Quote:
Originally Posted by r937
do they have to be materialized???
Yes. He will need the views to be indexed or he won't get any benefit from them.
__________________
If it's not practically useful, then it's practically useless.

blindman
www.chess.com: "sqlblindman"
Reply With Quote
  #8 (permalink)  
Old 09-02-08, 04:28
ronnyy ronnyy is offline
Registered User
 
Join Date: Feb 2008
Posts: 43
Quote:
Originally Posted by Pat Phelan
Having separate tables for values of "approved" doesn't make any more sense than having separate tables for the values of "product name" to me. The approved status is an attribute, just like name. I'd only have one table.
-PatP
If you have them all together in the same table, you will have to do all the time, two different selects to get the one's with the flag set, and the ones without the flag set. You will have to pay attention, in the client application to allways remove the not aproved ones, and all the time a new functionality will be implemented, a new report...you shouldn't forget to select based on the STATUS flag, which gives problems later on for maintainance, for other developers that will come, not knowing what that Status flag means.

If you have 20 mil records aproved (and 10.000 not aproved) all the time you'll want to select either the aproved one, or not, you'll have to go trough all the records. I don't see it liekely to have the case when you'll select both aproved and not aproved records in the same SELECT.

If you have another table for anuthorized(unapproved) you could add there, the status of the unauthorised record, it can vary(some user just wanted to pause his work, another record is on the second level of authorisation, another one, just couldn't be processed from a automatic process.

What I mean, is that in the unauthorised table you can have much more detailed status, and you can hold the information strictly related to the authorisation, which in the one table scenario will all have to be added to the main (approved and unapproved) table.
And with two tables you do not lose performance, from my point of view, you gain performance.

Basically , I do not like Status fields, they tend to be overused, you will want to add another type of Status Code, and that just adds to the complexity and makes it unclear, and not to simplicity. The Authorisation Flag is not the same as a flag "In stock: Yes/No"

Last edited by ronnyy; 09-02-08 at 15:30.
Reply With Quote
  #9 (permalink)  
Old 09-02-08, 08:20
andrewst andrewst is offline
Moderator.
 
Join Date: Sep 2002
Location: UK
Posts: 5,171
Quote:
Originally Posted by blindman
Yes. He will need the views to be indexed or he won't get any benefit from them.
Not necessarily, surely? Regular views still benefit from the indexes on the underlying tables.
__________________
Tony Andrews
http://tinyurl.com/tonyandrews
Reply With Quote
  #10 (permalink)  
Old 09-02-08, 10:22
blindman blindman is offline
World Class Flame Warrior
 
Join Date: Jun 2003
Location: Ohio
Posts: 11,726
Most of his OLTP queries will be done against "Approved" records. To avoid scanning for these records repeatedly (being Boolean they will not index very well) he needs to create an indexed view displaying only those records.
This has nothing to do with the indexes on the underlying table.

davoutuk, what RDBMS are you planning on using?
__________________
If it's not practically useful, then it's practically useless.

blindman
www.chess.com: "sqlblindman"
Reply With Quote
  #11 (permalink)  
Old 09-02-08, 13:56
andrewst andrewst is offline
Moderator.
 
Join Date: Sep 2002
Location: UK
Posts: 5,171
I would question "needs to", really. If the performance is fine without the materialized views, why create them? I would only consider them if performance was found to be an issue.
__________________
Tony Andrews
http://tinyurl.com/tonyandrews
Reply With Quote
  #12 (permalink)  
Old 07-07-09, 15:01
davoutuk davoutuk is offline
Registered User
 
Join Date: Aug 2008
Posts: 9
Thanks for all the replies... Here's some more info on the problem...

I'm building a system to capture staff skills/expertise.

The basic entity hierarchy is something like...
  • A staff member is represented by an 'expert'
  • An expert can have one or more areas of 'expertise'
  • An expertise can have one or more episodes that record the expert's experience
There will be an approval process where a staff member's data changes/additions to any one of these levels has to be approved by their manager.

Hence, the database must allow for...
  • a version of approved data that is visible for public view, and
  • a possible 'WIP' version of the same entity that is not available to the public and awaiting approval from the manager.
  • This approved/WIP duplication can exist at all three expert/expertise/episode levels.

Thoughts?

My underlying DB is MySQL 5.x The design is aimed to address organizations with a headcount of up to 10,000.

Last edited by davoutuk; 07-08-09 at 04:25.
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On