If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > General > Database Concepts & Design > Row Lineage : Where and how best to store it ?

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 01-11-05, 17:34
arnzie arnzie is offline
Registered User
 
Join Date: Mar 2004
Posts: 30
Row Lineage : Where and how best to store it ?

I am looking to implement row lineage into a new enterprise database, so I am able to track which process, on which date, etc. was responsible for creating that row... Obviously I have a set of audit tables which store the processes which have run, and I was looking to have each row of data, in each operation table, tagged with an ID from that audit table...

First question : I assume the logical place to store this row lineage is as an extra column on each actual table, therefore I can link to the audit information directly from there when querying, if need be... The only other solution I could think of would be to have a seperate 'Data Registry' table, which stores the Audit ID, the table im talking about, and the primary key structure... This seems unnecessary, as it is much easier to store on the actual table, and there are many variations of key structures across all of the tables anyway, so that would be difficult to store... The only advantage of the 2nd solution would be that I do not have metadata stored on each operational table, which I know some data modellers may believe to be the correct way to go... Im definately leaning towards the first option, but was looking for some opinions please ?

Second Question : If I do end up storing the Audit ID on each operational table, obviously I can only store one at a time against each row... But I could have inserted the data from one process, then updated it from another... I am talking about a normalised data warehouse here, so data is rarely overwritten anyway, as history is stored, but in the case that it is, what do you see as being the correct method... I can see the advantages in both, ie: only store the Audit ID that created the data, any updates do not alter the Audit ID, or every time you perform an update to a row, update with the current Audit ID... Any opinions on that please ?

Thanks,
Adam
Reply With Quote
  #2 (permalink)  
Old 01-11-05, 21:02
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
every operation should get its own current audit id, as well as a datetime stamp

typically in this situation, both the audit id and a datetime stamp become part of the primary key, because they "extend" the primary key so as to allow multiple occurrences of a row for the original primary key

it's hard to see this as a separate table, so i would go for your first solution
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #3 (permalink)  
Old 01-11-05, 21:53
arnzie arnzie is offline
Registered User
 
Join Date: Mar 2004
Posts: 30
thanks.... the current design has the date-time stamp is recorded on the actual audit table ( applicable to the job which was responsible ), therefore is not required on each operational data table... i see what you are saying about including the 'audit_id' as part of the primary key, but that interferes with the normal key structure of the table, which states that say a customer may only exist once ... If i start letting my metadata dictate the table / PK structure, it will compromise the reporting from the table, which is of course the primary objective.... as i said before, this is a data warehouse therefore is history enabled, meaning most tables are not updated, but there are a few instances where this is not the case, say the tables holding the important entities within my organisation ( customers, accounts, products, etc. )

regards,
adam
Reply With Quote
  #4 (permalink)  
Old 01-11-05, 23:15
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
right, well, from your original few paragraphs of explanation i though i understood it, but obviously not well enough, so if you do not want to extend a primary key in a data warehouse then of course you can only store one row with each value so i don't see how you can build up an audit history if you can only store one version of each record

good luck with it anyway
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #5 (permalink)  
Old 01-12-05, 00:04
certus certus is offline
Registered User
 
Join Date: Dec 2003
Location: Canada
Posts: 710
Would it be possible to create a separate audit history table that contains a surrogate key for the modification, a primary key of the row modified, a date stamp and a process id?
__________________
visit: relationary
Reply With Quote
  #6 (permalink)  
Old 01-12-05, 01:19
arnzie arnzie is offline
Registered User
 
Join Date: Mar 2004
Posts: 30
you would need to include the table/schema the change applies to in that Audit table also... but the problem is that the primary key structures across all the different tables is different, some are single attribute, some are multi-part, etc... how would i get around that apart from storing them in a single CHAR column ( with each seperated by commas, for example ), and if i did that, i wouldnt be able to join to the original table any more ?

thanks,
adam
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On