| |
|
If this is your first visit, be sure to check out the FAQ by clicking the link above.
You may have to register before you can post: click the register link above to proceed.
To start viewing messages, select the forum that you want to visit from the selection below.
|
 |

01-11-05, 17:34
|
|
Registered User
|
|
Join Date: Mar 2004
Posts: 30
|
|
|
Row Lineage : Where and how best to store it ?
|
|
I am looking to implement row lineage into a new enterprise database, so I am able to track which process, on which date, etc. was responsible for creating that row... Obviously I have a set of audit tables which store the processes which have run, and I was looking to have each row of data, in each operation table, tagged with an ID from that audit table...
First question : I assume the logical place to store this row lineage is as an extra column on each actual table, therefore I can link to the audit information directly from there when querying, if need be... The only other solution I could think of would be to have a seperate 'Data Registry' table, which stores the Audit ID, the table im talking about, and the primary key structure... This seems unnecessary, as it is much easier to store on the actual table, and there are many variations of key structures across all of the tables anyway, so that would be difficult to store... The only advantage of the 2nd solution would be that I do not have metadata stored on each operational table, which I know some data modellers may believe to be the correct way to go... Im definately leaning towards the first option, but was looking for some opinions please ?
Second Question : If I do end up storing the Audit ID on each operational table, obviously I can only store one at a time against each row... But I could have inserted the data from one process, then updated it from another... I am talking about a normalised data warehouse here, so data is rarely overwritten anyway, as history is stored, but in the case that it is, what do you see as being the correct method... I can see the advantages in both, ie: only store the Audit ID that created the data, any updates do not alter the Audit ID, or every time you perform an update to a row, update with the current Audit ID... Any opinions on that please ?
Thanks,
Adam
|
|

01-11-05, 21:02
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
|
|
every operation should get its own current audit id, as well as a datetime stamp
typically in this situation, both the audit id and a datetime stamp become part of the primary key, because they "extend" the primary key so as to allow multiple occurrences of a row for the original primary key
it's hard to see this as a separate table, so i would go for your first solution
|
|

01-11-05, 21:53
|
|
Registered User
|
|
Join Date: Mar 2004
Posts: 30
|
|
|
|
thanks.... the current design has the date-time stamp is recorded on the actual audit table ( applicable to the job which was responsible ), therefore is not required on each operational data table... i see what you are saying about including the 'audit_id' as part of the primary key, but that interferes with the normal key structure of the table, which states that say a customer may only exist once ... If i start letting my metadata dictate the table / PK structure, it will compromise the reporting from the table, which is of course the primary objective.... as i said before, this is a data warehouse therefore is history enabled, meaning most tables are not updated, but there are a few instances where this is not the case, say the tables holding the important entities within my organisation ( customers, accounts, products, etc. )
regards,
adam
|
|

01-11-05, 23:15
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
|
|
right, well, from your original few paragraphs of explanation i though i understood it, but obviously not well enough, so if you do not want to extend a primary key in a data warehouse then of course you can only store one row with each value so i don't see how you can build up an audit history if you can only store one version of each record
good luck with it anyway
|
|

01-12-05, 00:04
|
|
Registered User
|
|
Join Date: Dec 2003
Location: Canada
Posts: 710
|
|
Would it be possible to create a separate audit history table that contains a surrogate key for the modification, a primary key of the row modified, a date stamp and a process id?
|
|

01-12-05, 01:19
|
|
Registered User
|
|
Join Date: Mar 2004
Posts: 30
|
|
you would need to include the table/schema the change applies to in that Audit table also... but the problem is that the primary key structures across all the different tables is different, some are single attribute, some are multi-part, etc... how would i get around that apart from storing them in a single CHAR column ( with each seperated by commas, for example ), and if i did that, i wouldnt be able to join to the original table any more ?
thanks,
adam
|
|
| Thread Tools |
Search this Thread |
|
|
|
| Display Modes |
Linear Mode
|
Posting Rules
|
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts
HTML code is Off
|
|
|
|
|