If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > General > Database Concepts & Design > A quesion of star Schema

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 04-06-08, 03:50
saxontse saxontse is offline
Registered User
 
Join Date: Apr 2008
Posts: 4
A quesion of star Schema

hello everyone,

I am a new member of this forum, I have a question about star schema and I hope those nice guy can help me.

I am now working with a assignment of Data warehouse, and now I am designing the star schema but get problem.

The first one I want to ask is did Star Schema only can 1 FACT table for storing calculable information?

The assignment background is integrate 3 independent DB, 2 is very similar (DB1 & DB2) while 1 is different (DB3). DB1 and DB2 have "project" and "activity" table but DB3 only contains "project" table. 1 Project can contain many activities.

And now Data warehouse want to store time spent in Project and Activity.
What can I do about this problem?
I have thought that using 2 FACT table for storing project time spent and activity time spent, but it seems star schema only can use 1 table. So I want for help.........

Thank you,
Saxon
Reply With Quote
  #2 (permalink)  
Old 04-06-08, 10:12
Pat Phelan Pat Phelan is offline
Resident Curmudgeon
 
Join Date: Feb 2004
Location: In front of the computer
Posts: 12,605
Are you using database and table interchangably? I can read your question two ways and get very, VERY different meanings.

Since you are discussing star schemas, I assume that you're using the Kimball approach. There are other star schema disciplines.

How are you going to handle time, since none of your entities seem to include any kind of time reference?

-PatP
Reply With Quote
  #3 (permalink)  
Old 04-06-08, 10:34
saxontse saxontse is offline
Registered User
 
Join Date: Apr 2008
Posts: 4
Thank you for answering my question first, you don't understand my question my due to my english problem, sorry about that.

The attachment is about sample data of 3 DBs.
DB1 and DB2 is very similar, while inconsistent with DB3.
In DB1 and DB2, 1 project can contain more than 1 activities, but DB3 do not have activity table.
Moreover, 1 project contains more than 1 employee and 1 activity also.

Actually, I don't know what is your meaning of Star schema approach, did Kimball approach is the simplest and common one?

In this stage, I need to create Star schema from the user requirement, which is used for analyzing time spent of project and activities. After that, I need to do the ETL process from those DBs.

The reason of why I ask did Star Schema can only contain 1 FACT table is becuase I want to know whether my model is correct or not. The last attachment is my model. I don't know how to deal with between consultant, project and activity, so I haven't add into star schema.
Attached Thumbnails
A quesion of star Schema-db1.gif   A quesion of star Schema-db2.gif   A quesion of star Schema-db3.gif   A quesion of star Schema-my_model.jpg  
Reply With Quote
  #4 (permalink)  
Old 04-06-08, 11:51
Pat Phelan Pat Phelan is offline
Resident Curmudgeon
 
Join Date: Feb 2004
Location: In front of the computer
Posts: 12,605
Your English wouldn't be taken for a native speaker's, but it is significantly better than my Chinese!

The problem I have answering your question is that there are many definitions for Star Schema, and different definitions can either permit one or many fact tables. The vast majority of the definitions allow more than one fact table.

While I don't know what your instructor teaches or thinks, I see a star schema as a presentation mechanism, something that is used to deliver a data mart. In my mind, a star schema is literally the output of a data warehouse, not the warehouse itself.

Without clearly understanding your assignment, I can only offer advice as an observer. View this with considerable skepticism. I would only use one fact table for what you've described so far. I would roll up the staging data by employee, project, activity, and even though you haven't specified date/time I'd include them in the fact table too.

-PatP
Reply With Quote
  #5 (permalink)  
Old 04-06-08, 12:10
saxontse saxontse is offline
Registered User
 
Join Date: Apr 2008
Posts: 4
Excuse me, I really don't understand why you said only 1 FACT table will include. The problem I don't know how to deal with is the the FACT table content.

If the star schema contain project and activity dimension, what will the FACT table include?

The DB3 do not have activity table, if only 1 FACT table included. When data extracted from DB3 database, how to dealing with a null foreign key "Activity_Key" in the FACT table?

The FACT table structure

Project_key, Time_key, employee_key, Activity_key, total_day_spent, total_hour_spent.

That is the one of problem I don't know how to deal with.
Reply With Quote
  #6 (permalink)  
Old 04-06-08, 13:01
Pat Phelan Pat Phelan is offline
Resident Curmudgeon
 
Join Date: Feb 2004
Location: In front of the computer
Posts: 12,605
NULL values in a fact table are a bad thing, so you need to avoid that. I'd probably create an arbitrary "empty" activity key (simply spaces, or any arbitrary value that isn't likely to ever be a valid activity id) and a matching row in the activity dimension table with a description like "blank, for the case of DB3). This allows you to roll all three of your databases into a single fact table, and avoids the need for a lot of special handling by both your data warehouse administrator and much more importantly by your data mart users.

There are some data warehouse methodologies that don't like introducing an arbitrary "artifact" like this. I've been doing this kind of thing long enough that I've found this to be the optimum solution for the vast majority of cases.

One of the primary purposes of a data warehouse (I'd argue the only real purpose) is to allow functional power users to find nuggets of information that are hidden in mountains of data. This kind of solution makes perfect sense to those users, it is easy to explain, easy to understand, and easy to manipulate. It makes their existing queries easy to write, and their yet-unimagined queries a lot more likely to be correct.

-PatP
Reply With Quote
  #7 (permalink)  
Old 04-07-08, 01:00
blindman blindman is offline
World Class Flame Warrior
 
Join Date: Jun 2003
Location: Ohio
Posts: 11,726
Star schemas are for datamarts. They are a poor design for data warehouses.
__________________
If it's not practically useful, then it's practically useless.

blindman
www.chess.com: "sqlblindman"
Reply With Quote
  #8 (permalink)  
Old 04-07-08, 07:32
Pat Phelan Pat Phelan is offline
Resident Curmudgeon
 
Join Date: Feb 2004
Location: In front of the computer
Posts: 12,605
Quote:
Originally Posted by blindman
Star schemas are for datamarts. They are a poor design for data warehouses.
Post #4, paragraph 3. Good to see that we agree!

-PatP
Reply With Quote
  #9 (permalink)  
Old 04-07-08, 07:33
pootle flump pootle flump is offline
King of Understatement
 
Join Date: Feb 2004
Location: One Flump in One Place
Posts: 14,905
Quote:
Originally Posted by Pat Phelan
Good to see that we agree!
Bookmarked
__________________
Testimonial:
Quote:
pootle flump
ur codings are working excelent.
Reply With Quote
  #10 (permalink)  
Old 04-07-08, 11:04
saxontse saxontse is offline
Registered User
 
Join Date: Apr 2008
Posts: 4
Thank you for your help!!!
I will continue do my assignment
Thank you very much!!!
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On