I'm designing a data warehouse. The key object that links all the databases is an employee. An employee has the following attributes: name, username, phone number, manager. All the attributes are important and I have to track their changes.
E.g. we have telephony database, and there phone number is the key, so in DW I have to link the fact of the call with proper employee.
Then, consider the followin example. The manager of the employee changed from M1 to M2 effective of T1. If I want to check employees perofrmance in general, then I don't care who is his manager. But if the manager M2 wants to track KPIs of his team, then we should include the employee into the result set strating from T1. And so on...
All the data is volatile, so I decided to do the following:
1. Create the table IDENTITY (id PK); it has ONLY ONE column, that is it. It has only one purpose - to keep changes of employee master data linked together.
2. Create the table EMPLOYEES (id PK; identity_id FK REF IDENTITY.id; name; username; phone; mamanger_id FK REF EMPLOYEES.id, start_date, end_date).
Now consider another example. Irina is working in the company for 5 years. We have the following rows in EMPLOYEES table:
As you can see there are the following changes:
1) Last name from Ivanova to Petrova;
2) Account name from SPBIVANOI to MONPETROI;
3) Phone number from 1111111 to 2222222;
4) Manager from 51 to 42.
But identity_id column (the second one, = 3) keep all the changes linked.
I need your opinion regarding this architecture. Does it make sence? Do you know any other approaches on how to keep high-volatile master data in a DW in consistent state?
Type 4, from that article, is the absolute only method I would ever use and trust.
One table that holds current values, and an archive table that holds all historical values (INCLUDING the current value!).
The majority of your queries will run quickly against the current values table, but when you want to do point-in-time querying, the history tables contain all the data required.
...and I'm not getting the point of your "IDENTITY" table. I just can't see the use of it. There is something wrong about your concept, and I encourage you to revisit it.
If I ever came across a database at a client with such a table, I'd think it was pretty screwy.
If it's not practically useful, then it's practically useless.