Page 1 of 2 12 LastLast
Results 1 to 15 of 16
  1. #1
    Join Date
    Jun 2004
    Location
    CA
    Posts
    9

    Database Design - Sports related...

    Just for some background before I get to the questions:

    I would like to consider myself a pretty good programmer, but when it comes to databases I know enough to be dangerous, enough to get things done, but not enough to get things done properly when the problem gets real complex. I am creating a web based application that will allow users to create their own "leagues" where they can have teams/individuals and track stats.

    I have done this for pro ( american ) football before, but I want to make it flexible enough where I can make it work for just about any sport, be it real, virtual ( video games ), or completely made up.

    My first question revolves around stats for individual games and totals.

    For example, I would want the following stats tracked on a per team basis ( this is very limited to keep this post from getting too long )
    - Points
    - Total Offense
    - Passing Yards
    - Rushing Yards
    - Touchdowns
    ( the opposite stats would be tracked as well, yards allowed, passing yards allowed, points allowed, etc ).

    I currently have a table for both total stats and individual games, the reason being I don't want to have to hit the DB to calculate wins/losses in totals, divisions and conferences every time I want to list the team standings. The only problem I have with having a totals table is that any time the individual games are modified, I have to update the totals as well. I think that updating it 1x as opposed to calculating them everytime takes less time and is less wear and tear on the DB.

    Have I gone about this the wrong way? Is there a better way to do it?

    On a related subject I also want users to be able to fully customize what stats they track ( be they for a team or individual ). Here is where I run into some trouble coming up with the best implementaion. Is it best to have a table that holds the standard stats and then another table that holds the custom stats and refers to the team/individual by a team_id, stat_id and have yet ANOTHER table that holds the stat_id and other information about the stat that is to be tracked. Or should I have a 1 big table with a lot of very generically named fields such as stat_1, stat_2, stat_3, etc etc.

    Thanks in advance!

  2. #2
    Join Date
    Apr 2004
    Location
    Toronto, Canada
    Posts
    249
    It is generally nota good idea to have a totals table. From your problem description it seems that the volume of data is small, maybe a few thousand rows or so.

    Just use separate queries to do the totals. For the convenience of the application developer, encapsulate the query in a view so that the app developer queries the view. If you have performance problems, then try optimizing the queries in the views. Only as a last resort should you create a summary table.

    For tracking customer preferences on what stats they track, one method may be to create a table called User_Preferences with columns (userID, preferred_stat). The values in the preferred_stat column would match (or map to) the column names of the stats table.

    Each User would have a list of stats that they are interested in. A program would build the SQL query dynamically, execute it, and then present the results to the user.

    Hope that was useful.

    Ravi

  3. #3
    Join Date
    Jun 2004
    Location
    CA
    Posts
    9
    rajiravi,

    Thanks for the response.

    Let me go into a bit more detail as to why I think I should have a totals table. If you still don't agree with me, then I will look into another way of doing it.

    For team standings and to determine who goes into the playoffs, I need to know the team standings on total wins/losses, division wins/losses, and conference wins/losses...

    So for example, I have a "Game" table

    Such as
    Game(
    Game_ID
    Home_Team_ID
    Away_Team_ID
    Stat1
    Stat2
    Stat3
    ....
    )

    Then a totals table I that would at minimum have
    Team_Totals(
    Wins
    Losses
    DivisionWins
    DivisionLosses
    ConferenceWins
    ConferenceLosses
    )

    It could have totals for stat1, stat2, stat3, etc etc as well.

    The wins, losses, divisionwins, divisionlosses etc are all necessary to properly sort the team standings. Without always having a total of those stats around, I would have to query just on wins and losses, then query again to see if they are in the same division and then query again to see if they are in the same conference.

    If you can see another way of doing it without having the totals table and keeping queries to somewhat of a minimum, I'm all ears.

    Thanks again!

  4. #4
    Join Date
    Apr 2004
    Location
    Toronto, Canada
    Posts
    249
    Yes, even in this case I would recommend that you not create summary tables.

    Rather, you should create views that look like tables to the users.

    Since I expect that the total number of games played in a year by all the teams in the league will be in the thousands, a view will still give you results in a second or two.

    If you post the actual data structures (tables), maybe we can create the queries that will generate the data and then see if the performance is acceptable.

    The reason for creating a view is simple. Tables need to be updated, a view automatically gets the most recent data. If performance is fine, then you get the best deal.

    If performance is not acceptable, then only should we look at other options.

    Ravi

  5. #5
    Join Date
    Jun 2004
    Location
    CA
    Posts
    9
    rajiravi,

    Again, thanks for the input. Here's my table structures:

    Conference(
    Conference_ID
    Conference_Name
    League_ID
    )

    Division(
    Division_ID
    Division_Name
    Conference_ID
    )

    Teams(
    Team_ID
    Division_ID
    Team_Name
    Team_Location
    Team_Icon
    ..... ( just other info about the team isolated to this table, not needed here )
    League_ID
    )

    Schedule(
    Schedule_ID
    Week_ID
    Schedule_Away_Team_ID
    Schedule_Home_Team_ID
    Schedule_Away_Team_Score
    Schedule_Home_Team_Score
    Schedule_Away_Team_Offense
    Schedule_Home_Team_Offense
    Schedule_Away_Team_Passing
    Schedule_Home_Team_Passing
    Schedule_Away_Team_Rushing
    Schedule_Home_Team_Rushing
    League_ID
    Season_ID
    Schedule_Submitted_Date
    Schedule_Updated
    )

    Across 2 weeks worth of scheduled games for 1 league a team can have 1 home game and 1 away game.

    I'd have to do a sum on the home team score when them team_id = XX and then a sum on the away team score when team_id = XX to get that teams total points scored. I then need to rank that against all the other teams.

    I can't think of a way to do that with this design. Maybe this design is flawed though, I appreciate the input.

  6. #6
    Join Date
    Apr 2004
    Location
    Toronto, Canada
    Posts
    249
    Here's what I've done.

    I created the tables and some dummy data. See attached file "sports_league_ddl.sql"

    Then, I created a view called team_stats with the following structure:

    TEAM_ID
    OPPONENT_ID
    WIN_LOSS
    TEAM_DIVISION_ID
    OPP_DIVISION_ID
    TEAM_CONF_ID
    OPP_CONF_ID
    Using this view as a starting point I wrote a query that generates the statistics for the win-loss (all wins/losses, division wins/losses and conference wins/losses).

    When I run this query on my home computer running Oracle 9i, it gets executed in 0.02 seconds (20 milliseconds). This for a league with 12 teams and a total of 18 games played.

    As you can see, you should be able to get results within a second even when you have a few thousand records. Performance should not be an issue.

    This is what I have been saying all along: Never denormalize unless you can show that the normalized schema has unacceptable performance.

    Here's the query:

    SELECT t.name,
    SUM (CASE WHEN win_loss = 'W' THEN 1 ELSE 0 END) AS wins,
    SUM (CASE WHEN win_loss = 'L' THEN 1 ELSE 0 END) AS losses,
    SUM (CASE WHEN team_division_id = opp_division_id
    THEN CASE WHEN win_loss = 'W' THEN 1 ElSE 0 END
    ELSE 0
    END ) AS division_wins,
    SUM (CASE WHEN team_division_id = opp_division_id
    THEN CASE WHEN win_loss = 'L' THEN 1 ElSE 0 END
    ELSE 0
    END ) AS division_losses,
    SUM (CASE WHEN team_conf_id = opp_conf_id
    THEN CASE WHEN win_loss = 'W' THEN 1 ElSE 0 END
    ELSE 0
    END ) AS conference_wins,
    SUM (CASE WHEN team_conf_id = opp_conf_id
    THEN CASE WHEN win_loss = 'L' THEN 1 ElSE 0 END
    ELSE 0
    END ) AS conference_losses
    FROM team_stats s, teams t
    WHERE t.team_id = s.team_id
    GROUP BY t.name
    /
    And here are the results from running the query.

    NAME WINS LOSSES DIVISION_WINS DIVISION_LOSSES CONFERENCE_WINS CONFERENCE_LOSSES
    ------------ ---------- ---------- ------------- --------------- --------------- -----------------
    East A1 2 1 1 1 2 1
    East A2 3 0 2 0 3 0
    East A3 2 1 1 1 1 1
    East B1 1 2 1 1 1 1
    East B2 1 2 0 2 1 2
    East B3 2 1 1 1 2 1
    West A1 1 2 0 1 0 2
    West A2 1 2 1 0 1 2
    West A3 2 1 1 1 2 1
    West B1 1 2 1 1 1 2
    West B2 1 2 0 1 1 2
    West B3 1 2 1 0 1 1

    12 rows selected.

    Elapsed: 00:00:00.02
    Hope that helps.


    Ravi
    Attached Files Attached Files

  7. #7
    Join Date
    Jun 2004
    Location
    CA
    Posts
    9
    Wow,

    That is really cool! I definitely need to get a better hold on complex SQL queries.

    My only other question comes down to points scored vs points allowed.

    In a similar fashion to how you did the wins/losses, how can I calculate the points scored by a team ( through all the weeks ) AND the points allowed by a team through all the weeks.

    Thanks again!

  8. #8
    Join Date
    Apr 2004
    Location
    Toronto, Canada
    Posts
    249
    Just like the team wins/losses were created, you could modify the view team_stats to include points, and then sum over them in the query.

    Ravi

  9. #9
    Join Date
    Jun 2004
    Location
    CA
    Posts
    9
    Ravi,

    Thanks for all your help, I think I've got things under control now!

  10. #10
    Join Date
    Jun 2004
    Location
    CA
    Posts
    9
    A final question for this ( I hope ).

    With team standings ( for example ), I want to select ONLY the top 1 from each division in each conference.

    I have been unable to come up with a (slick ) query to do so. Can you help point me in the right direction?

    I can always create a query that unions numerous other querys for each conference, but I was wondering if there were a better way.

    Thanks in advance.
    Last edited by big9erfan; 06-09-04 at 02:38.

  11. #11
    Join Date
    Apr 2004
    Location
    Toronto, Canada
    Posts
    249
    You already have the total wins and losses for each team by division and conference.
    All you have to do is to get the "top" based on whatever criteria you choose.
    Find the "top" within each division.

    Ravi

  12. #12
    Join Date
    Jun 2004
    Location
    CA
    Posts
    9
    Quote Originally Posted by rajiravi
    You already have the total wins and losses for each team by division and conference.
    All you have to do is to get the "top" based on whatever criteria you choose.
    Find the "top" within each division.

    Ravi

    I tried that, but it didn't return what I expected. I tried to do a
    Select top 1 .... where conf = 'xxx' and div = 'xxx'
    union
    Select top 1 .... where conf = 'xxx' and div = 'xxx'
    union
    Select top 1 .... where conf = 'xxx' and div = 'xxx'
    union
    Select top 1 .... where conf = 'xxx' and div = 'xxx'
    Order by wins, losses ASC

    That didn't return what I expected at all. Should I run individual queries per division?

  13. #13
    Join Date
    Apr 2004
    Location
    Toronto, Canada
    Posts
    249
    The answer to this question depends on how you define "top team".

    Let us say that the top team is the team with the best overall win/loss record.

    Then your query should try to find the team within each divsion-conference combo with the max(wins/(wins+losses)) value. For other criteria of "top team" you would get the appropriate statistic.

    It should now be easy to create this query. Use parts of the query already provided here and build on it. Select from the query that already calculates the wins/losses for you.

    I would prefer not to provide the query myself, because you will learn a lot if you do it yourself.

    Ravi

  14. #14
    Join Date
    Jun 2004
    Location
    CA
    Posts
    9
    rajiravi,

    I appreciate your help and more pointing me in the right direction than just giving me the answer. I much prefer that than the blanket answer and you're on your way thing.

    The only way I was able to get this to work the way I wanted so that I get ( using your example data )
    East A2
    East B3
    West A3
    West B3

    was doing a query like

    select top 1 team_name, team_conference, team_division where team_conference = 'xxx1' AND team_division = 'yyy1'
    UNION
    select top 1 team_name, team_conference, team_division where team_conference = 'xxx2' AND team_division = 'yyy1'
    UNION
    select top 1 team_name, team_conference, team_division where team_conference = 'xxx3' AND team_division = 'yyy3'
    ordery by wins, losses ASC, division_wins, division_losses asc, conference_wins, conference_losses ASC

    Is there a better way to do it, and if so I would appreciate a hint or a push in the right direction.

    Thanks again!

  15. #15
    Join Date
    Apr 2004
    Location
    Toronto, Canada
    Posts
    249
    Actually, what you've done just gets you the result of the first row for each division and conference.


    For each team, you need to calculate a "score" that identifies the "top" team, then sort the teams based on the score. After that, you hav to group the results by division and conference.

    Hope that helps.

    Ravi

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •