Dear Colleagues,
I'm designing a data warehouse. The key object that links all the databases is an employee. An employee has the following attributes: name, username, phone number, manager. All the attributes are important and I have to track their changes.
E.g. we have telephony database, and there phone number is the key, so in DW I have to link the fact of the call with proper employee.
Then, consider the followin example. The manager of the employee changed from M1 to M2 effective of T1. If I want to check employees perofrmance in general, then I don't care who is his manager. But if the manager M2 wants to track KPIs of his team, then we should include the employee into the result set strating from T1. And so on...
All the data is volatile, so I decided to do the following:
1. Create the table IDENTITY (id PK); it has ONLY ONE column, that is it. It has only one purpose - to keep changes of employee master data linked together.
2. Create the table EMPLOYEES (id PK; identity_id FK REF IDENTITY.id; name; username; phone; mamanger_id FK REF EMPLOYEES.id, start_date, end_date).
Now consider another example. Irina is working in the company for 5 years. We have the following rows in EMPLOYEES table:
201 | 3 | Ivanova, Irina | SPBIVANOI | 1111111 | 51 | 27.01.2005 | 31.05.2006
... after some time ...
943 | 3 | Petrova, Irina | MONPETROI | 2222222 | 42 | 01.10.2010 | NULL
As you can see there are the following changes:
1) Last name from Ivanova to Petrova;
2) Account name from SPBIVANOI to MONPETROI;
3) Phone number from 1111111 to 2222222;
4) Manager from 51 to 42.
But identity_id column (the second one, = 3) keep all the changes linked.
I need your opinion regarding this architecture. Does it make sence? Do you know any other approaches on how to keep high-volatile master data in a DW in consistent state?
Thank you.