Results 1 to 8 of 8
  1. #1
    Join Date
    Oct 2007
    Posts
    6

    Database / Data Warehouse Setup Advice

    Hello All

    I have a situation which has given me the chance to develop a data warehouse but before I continue I need to know if it's viable with the data I have available.

    I have three different sources of data (databases) but I have created views on each of the databases which summarises all the data that the end user's use.

    Here is a summary of the views:

    Player's Diaries
    - AnswerDate
    - PlayerId
    - QuestionId
    - AnswerValue

    Fitness Stats
    - TestDate
    - TestId
    - ProgramId
    - PlayerId
    - Result
    - ResultUnit

    Game Stats
    - GameDate
    - PlayerId
    - TimePlayed
    - Kicks

    Now, I'm kinda stuck on how I would put this into a data warehouse... if that is the right solution? If it is, how would I set this up in a star schema?

    Thanks in Advance!
    Rgds

  2. #2
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    First of all, a star schema is really a data mart. It is not a data warehouse.

    Next, you need to decide what the output needs to be and whether a warehouse will help you present that output or not.

    To answer your questions (both implicit and explicit), yes this schema could become a data warehouse and yes, it could become any of several star schemas.

    -PatP

  3. #3
    Join Date
    Oct 2007
    Posts
    6
    Thanks for the clearing up on the terminology.

    All the data that I need to report/query on is in the views, I would just like to combine them together also I would like to keep adding more views.

    I'm looking to grab data out for all three views for certain playerid's over time.

    Is there any chance you could provide an example of how I would represent the current views in a star schema?

    Rgds
    Last edited by schone; 10-12-07 at 22:44.

  4. #4
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    I don't know what you want the star schema to show, but if you can give me some clues about what you want the star schema to show, I can probably help you put an example together.

    The point that I think you are missing is that a star schema is just a way to present data, so there are a huge number of ways to create the schema based on what you want it to help you analyze. Without knowing what your goal is, the present views are good enough.

    -PatP

  5. #5
    Join Date
    Oct 2007
    Posts
    6
    Hi Pat,

    Thanks again for the reply.

    Here is my situation:

    I have 3 databases which data is stored into on a weekly basis, at the moment I have to connect & query each database then produce a report on that database then group all the data into one report, this is very time consuming. I'm looking to have a centralized location where I can query this location once and produce a report using all three data sources.

    Just to clarify:

    Players Diary - A diary system where players enter in responses to 7 questions 3 times a week.

    Fitness Stats - A database which contains data on each players on how long it took them to do certain drills (ie. 10 km, 100m runs)

    Game Stats - A database which contains statistics on each player during a game. (ie. Kicks, Catches etc.)

    I'm looking to be able to grab a report on certain player's for a certain time period, which shows player diary responses for questions, the fitness stats for that certain time and the game stats during that period.

    I would like to do averages, sum's etc on the diary responses, fitness stats, game stats etc.

    Hopefully this helps?

  6. #6
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    No, I still haven't been able to make clear what I need in order to be able to help you. I've run out of time for the moment, I need to work on a couple of personal projects (actually these two are for my fianc&#233 today and tomorrow, but I'll write up a better explanation for you tomorrow night.

    I understand what your basic schema is and how it works, although the extra explanation helps clarify a few points there too. The problem is that a data mart (what you see as a star schema) needs to have a purpose or a focus. If you think about how you'd use the mart, think about what problems you have aggregating data today and how you could trade storage space in order to reduce query time.

    I just don't have the time to contrive a good example right now, but I'll get that done in about 36 hours or so.

    -PatP

  7. #7
    Join Date
    Oct 2007
    Posts
    6
    Hi Pat

    Thanks for your time once again!

    I await your reply

    Thanks in advance!

  8. #8
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Uff-da! I'm sorry that it has taken me so long to reply to this thread.

    Lets shift the concepts a bit in order to make the examples of what I'm trying to show easier. I'm assuming that you are familiar with bowling, and I'm familiar with the things that bowling analysts are interested in seeing, and these examples are easier for me to work with than the abstraction of your example.

    Within the game of bowling, the atomic element of data is a single throw of the ball, but it is only possible to score at the frame level (and that score can affect the results of previous frames too. This means that the complexity of keeping score goes up drastically because of the need to track interactions between varying numbers of balls and the interaction can change depending on the results of intermediate throws. This is the kind of scoring that makes normal SQL coders quesy!

    For the moment, we'll ignore the idea of score in the game context as well as worrying about specific pin interactions and interpretations (which are very high interest to bowling analysts too).

    Logically, what we've discussed so far leads us to the following tables in our staging area:

    teams
    teamid
    name

    bowlers
    bowlerid
    teamid
    name

    frames
    frameid
    bowlerid
    base_ball
    fill_ball
    bonus_ball

    One of the thing that are very significant to a bowling analyst is a "strike" which is any frame where the first_ball knocks down all ten pins. Another significant event is a "Mark" where either the first_ball knocks down all ten pins, or the combined first_ball and the fill_ball knock down ten pins. Another significant event is a "Strike out", which is the tenth frame of a game where the first_ball, fill_ball, and bonus_ball each knock down ten pins.

    As you can see from what I've presented so far, score hasn't even entered into the picture... We aren't to the point where we even care yet about which frames are relative to which other frames. We already have several important measures, and the star schema is starting to break down, and we still haven't gotten to what the average spectator focuses on!

    Take a stab at a star schema for what is here so far, and talk about the problems that you hit and those that you foresee... That discussion may help a lot to cut down the amount of detail we need to pursue on this model before we can return to your original model.

    -PatP

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •