Results 1 to 6 of 6
  1. #1
    Join Date
    Oct 2010
    Posts
    2

    How to design master data model in Data Warehouse

    Dear Colleagues,

    I'm designing a data warehouse. The key object that links all the databases is an employee. An employee has the following attributes: name, username, phone number, manager. All the attributes are important and I have to track their changes.
    E.g. we have telephony database, and there phone number is the key, so in DW I have to link the fact of the call with proper employee.
    Then, consider the followin example. The manager of the employee changed from M1 to M2 effective of T1. If I want to check employees perofrmance in general, then I don't care who is his manager. But if the manager M2 wants to track KPIs of his team, then we should include the employee into the result set strating from T1. And so on...
    All the data is volatile, so I decided to do the following:
    1. Create the table IDENTITY (id PK); it has ONLY ONE column, that is it. It has only one purpose - to keep changes of employee master data linked together.
    2. Create the table EMPLOYEES (id PK; identity_id FK REF IDENTITY.id; name; username; phone; mamanger_id FK REF EMPLOYEES.id, start_date, end_date).

    Now consider another example. Irina is working in the company for 5 years. We have the following rows in EMPLOYEES table:

    201 | 3 | Ivanova, Irina | SPBIVANOI | 1111111 | 51 | 27.01.2005 | 31.05.2006
    ... after some time ...
    943 | 3 | Petrova, Irina | MONPETROI | 2222222 | 42 | 01.10.2010 | NULL

    As you can see there are the following changes:
    1) Last name from Ivanova to Petrova;
    2) Account name from SPBIVANOI to MONPETROI;
    3) Phone number from 1111111 to 2222222;
    4) Manager from 51 to 42.
    But identity_id column (the second one, = 3) keep all the changes linked.

    I need your opinion regarding this architecture. Does it make sence? Do you know any other approaches on how to keep high-volatile master data in a DW in consistent state?

    Thank you.

  2. #2
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    Slowly changing dimension - Wikipedia, the free encyclopedia
    What you are describing is type 2: Correct?
    Testimonial:
    pootle flump
    ur codings are working excelent.

  3. #3
    Join Date
    Oct 2010
    Posts
    2

    Talking

    Exactly! Thank you very much and sorry for dumb question! Seems that I have to re-read DW theory again

  4. #4
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    It's not a dumb question at all; don't worry.
    Testimonial:
    pootle flump
    ur codings are working excelent.

  5. #5
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Type 4, from that article, is the absolute only method I would ever use and trust.
    One table that holds current values, and an archive table that holds all historical values (INCLUDING the current value!).
    The majority of your queries will run quickly against the current values table, but when you want to do point-in-time querying, the history tables contain all the data required.

    ...and I'm not getting the point of your "IDENTITY" table. I just can't see the use of it. There is something wrong about your concept, and I encourage you to revisit it.
    If I ever came across a database at a client with such a table, I'd think it was pretty screwy.
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

  6. #6
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    I'll bet an old hand like you has used Type 1 once or twice
    Testimonial:
    pootle flump
    ur codings are working excelent.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •