Results 1 to 10 of 10
  1. #1
    Join Date
    Apr 2008
    Posts
    4

    A quesion of star Schema

    hello everyone,

    I am a new member of this forum, I have a question about star schema and I hope those nice guy can help me.

    I am now working with a assignment of Data warehouse, and now I am designing the star schema but get problem.

    The first one I want to ask is did Star Schema only can 1 FACT table for storing calculable information?

    The assignment background is integrate 3 independent DB, 2 is very similar (DB1 & DB2) while 1 is different (DB3). DB1 and DB2 have "project" and "activity" table but DB3 only contains "project" table. 1 Project can contain many activities.

    And now Data warehouse want to store time spent in Project and Activity.
    What can I do about this problem?
    I have thought that using 2 FACT table for storing project time spent and activity time spent, but it seems star schema only can use 1 table. So I want for help.........

    Thank you,
    Saxon

  2. #2
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Are you using database and table interchangably? I can read your question two ways and get very, VERY different meanings.

    Since you are discussing star schemas, I assume that you're using the Kimball approach. There are other star schema disciplines.

    How are you going to handle time, since none of your entities seem to include any kind of time reference?

    -PatP

  3. #3
    Join Date
    Apr 2008
    Posts
    4
    Thank you for answering my question first, you don't understand my question my due to my english problem, sorry about that.

    The attachment is about sample data of 3 DBs.
    DB1 and DB2 is very similar, while inconsistent with DB3.
    In DB1 and DB2, 1 project can contain more than 1 activities, but DB3 do not have activity table.
    Moreover, 1 project contains more than 1 employee and 1 activity also.

    Actually, I don't know what is your meaning of Star schema approach, did Kimball approach is the simplest and common one?

    In this stage, I need to create Star schema from the user requirement, which is used for analyzing time spent of project and activities. After that, I need to do the ETL process from those DBs.

    The reason of why I ask did Star Schema can only contain 1 FACT table is becuase I want to know whether my model is correct or not. The last attachment is my model. I don't know how to deal with between consultant, project and activity, so I haven't add into star schema.
    Attached Thumbnails Attached Thumbnails DB1.GIF   DB2.GIF   DB3.GIF   my_model.JPG  

  4. #4
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Your English wouldn't be taken for a native speaker's, but it is significantly better than my Chinese!

    The problem I have answering your question is that there are many definitions for Star Schema, and different definitions can either permit one or many fact tables. The vast majority of the definitions allow more than one fact table.

    While I don't know what your instructor teaches or thinks, I see a star schema as a presentation mechanism, something that is used to deliver a data mart. In my mind, a star schema is literally the output of a data warehouse, not the warehouse itself.

    Without clearly understanding your assignment, I can only offer advice as an observer. View this with considerable skepticism. I would only use one fact table for what you've described so far. I would roll up the staging data by employee, project, activity, and even though you haven't specified date/time I'd include them in the fact table too.

    -PatP

  5. #5
    Join Date
    Apr 2008
    Posts
    4
    Excuse me, I really don't understand why you said only 1 FACT table will include. The problem I don't know how to deal with is the the FACT table content.

    If the star schema contain project and activity dimension, what will the FACT table include?

    The DB3 do not have activity table, if only 1 FACT table included. When data extracted from DB3 database, how to dealing with a null foreign key "Activity_Key" in the FACT table?

    The FACT table structure

    Project_key, Time_key, employee_key, Activity_key, total_day_spent, total_hour_spent.

    That is the one of problem I don't know how to deal with.

  6. #6
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    NULL values in a fact table are a bad thing, so you need to avoid that. I'd probably create an arbitrary "empty" activity key (simply spaces, or any arbitrary value that isn't likely to ever be a valid activity id) and a matching row in the activity dimension table with a description like "blank, for the case of DB3). This allows you to roll all three of your databases into a single fact table, and avoids the need for a lot of special handling by both your data warehouse administrator and much more importantly by your data mart users.

    There are some data warehouse methodologies that don't like introducing an arbitrary "artifact" like this. I've been doing this kind of thing long enough that I've found this to be the optimum solution for the vast majority of cases.

    One of the primary purposes of a data warehouse (I'd argue the only real purpose) is to allow functional power users to find nuggets of information that are hidden in mountains of data. This kind of solution makes perfect sense to those users, it is easy to explain, easy to understand, and easy to manipulate. It makes their existing queries easy to write, and their yet-unimagined queries a lot more likely to be correct.

    -PatP

  7. #7
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Star schemas are for datamarts. They are a poor design for data warehouses.
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

  8. #8
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Quote Originally Posted by blindman
    Star schemas are for datamarts. They are a poor design for data warehouses.
    Post #4, paragraph 3. Good to see that we agree!

    -PatP

  9. #9
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    Quote Originally Posted by Pat Phelan
    Good to see that we agree!
    Bookmarked
    Testimonial:
    pootle flump
    ur codings are working excelent.

  10. #10
    Join Date
    Apr 2008
    Posts
    4
    Thank you for your help!!!
    I will continue do my assignment
    Thank you very much!!!

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •