Results 1 to 6 of 6
  1. #1
    Join Date
    Feb 2010
    Posts
    4

    Unanswered: Help with the very first steps of a Football/Soccer database

    As I mentioned in another post here I have been struggling to understand databases for around ten years although I usually pick things up fairly easily.

    I am re-starting my attempts to build a database of games, players and statistics for my local football team as well as statistics for other teams in competitions my side has played in.

    As there are just over 5000 matches and just over 1000 teams, I want to try to set the database up from the start to make the initial data entry as simple as possible.



    In English football the matches run in 'Seasons' which generally run from August until May and take their titles from the years they span. For example this Season is 2009/2010.

    The team has played in many different competitions over the years but not all in every season so what I am looking for is a way that when I put a date in the 'Games' table it would restrict the available competitions to those that would have been played in the corresponding season.

    I thought that the way around this would be by using a 'bridging' table called 'CompetitionsBySeason'.

    I have attached a graphic of the wrong way I have always tried to link competitions to seasons, simply linking the named fields together. I know that they should actually be linked by the ID's as foreign keys and that the ID field in 'CompetitionBySeason' is probably superfluous but I have a couple of questions on how to do it this way.

    The reason I have always tried to link the named fields is just that, they have the actual name in it, whereas the ID is just a number. If I know that my team played in the F A Cup in the 2007/2008 season I know what I am looking for in the dropdowns/lookups. However, with using the ID numbers I will have to take a look at other tables to see what the F A Cups ID is and also the 2007/2008 season ID.

    I fully assume that I am missing something simple here as you cannot, surely, have to remember all ID numbers when entering data?

    Also I get confused with joins so would not know if, on this occasion, it needed to be an Inner Join or left or right.

    I have read up on the subject time and time again from many different sources and just cannot grasp it for some reason so am hoping someone would possibly be able to put it in real laymens/idiots terms for me and explain why it would be done like that. Once I understand it it will stay in my head but it is one of those that no matter how many times in the past I have been given links to pages to read up on it, it just does not go in.

    I hope that I have explained it clearly but if not I can try to rephrase it.

    Thanks in advance
    Steve
    Attached Thumbnails Attached Thumbnails SeasonByCompetition.jpg  

  2. #2
    Join Date
    Nov 2004
    Location
    out on a limb
    Posts
    13,692
    Provided Answers: 59
    FWIW I did some fleshing out of a data model for recording events in a football match.. it was aimed at the statistics obsessed (be they media, pundits whatever) so that they coudl identify who did waht when and in what game

    IIRC the model was never finished, they guy I was talking to lost interest (new girlfriend I seem to remember) I think the entities \re all there, just not fully fleshed out

    however it could give a flavour of what may work. bear in mind it was targetted at a comentator type recording of a match so its not what you were looking for
    identifying scores/results may require some refining/. I think the IsHighlightedEvent was intended to identify significant events (eg scoring, bookings, sending off, substituion and so on. The guy wanted the capability to know who scored, when soemone was substituted, then number of bookings per player/team and so on
    Attached Files Attached Files
    I'd rather be riding on the Tiger 800 or the Norton

  3. #3
    Join Date
    Feb 2010
    Posts
    4
    I am not going to pretend for one second that I understand how every bit of the model works, though looking through it carefully it seems to be almost exactly what I am looking for!
    I am a bit statistically obsessed myself so all the things likes bookings and substitutions will a key part of what I am looking to achieve, along with things like minutes played, games since they scored etc etc. I will also be looking for game statistics such as how long the team has gone since an away win.
    I will print it out and have a proper look at it at work tomorrow but may go ahead and create the tables tomorrow evening as a starting point. I could always amend them later if needed.

    The only things that I can see as being missing from what I would be looking for at the moment is, as well as the actual scores, a field for the attendance, but assume that they would all just be extra fields in the Fixture Table.
    Also, footballclubs has a ClubID but no team name. If I had a separate table of team names then they could be possibly linked to the ClubID with 'From Date' and 'To Date' fields. That way I could link the same ID to the various names a team would have had over the years.

    Thanks for taking the time to send that. As I said I will probably set up the tables Monday evening and take it from there.
    Steve

    PS. In the diagram, I assume that the join directions are in the direction of the little arrows at the start of the join, but I will set it up unrelated first.

  4. #4
    Join Date
    Nov 2004
    Location
    out on a limb
    Posts
    13,692
    Provided Answers: 59
    I think the FixtureEventTypes entity needs refining, arguabky what you want is to defien an event type
    so events could be, say...
    Disciplinary (eg booking, sendign off etc)
    Goal Scored
    Substitution
    fouls
    throwins
    corners
    shots on goal (you'd need to think how you produce the shots on target bit (you'd eother double record a goal (one entry for goal, one entry for shot on target) or better add goals scored to shots on target. however you would need to refine how you treat things like own goals

    things this model would struggle with
    say if a match official is replaced (eg referee changes..). you could consider having say two match officials
    say if more thna two players are involved in an incident. you could delegate the players to a sub table which woudl then allow you to handle multiple players involved.
    say if you wanted record activities from players who are not on the field or the substitutes bench (eg Neville from the Scum mouthing off to Tevez

    You would need to think about how you identify home and away games. the natural choice in my mind would be Team A is home, Team B is away (unless you are US based in which its the other way round.

    before creating tables think trhough weach entity.

    the notation is 'crows foot' the single bar is the "master" tabel (the one that holds the primary key) the tripod is the "child table" *the one that refers to teh PK in the mater table.
    dotted lines are 'non identifying' meaning they do not comprise the primary key in the child table

    if you want to knock about the model I'm quite happy to post a copy of it here. you woudl need to download Power*Architect to manipulate and refine the model.
    SQL Power Software - SQL Power Tools
    I'd rather be riding on the Tiger 800 or the Norton

  5. #5
    Join Date
    Feb 2010
    Posts
    4
    Quote Originally Posted by healdem View Post
    I think the FixtureEventTypes entity needs refining, arguabky what you want is to defien an event type
    I fully agree there and there would be a stack of events listed. As well as the ones you mentioned there would be various types of goal scored (open play, penalty, own goal and even maybe header, right foot, left foot, inside area, outside area etc) although I think that some may be in a sub-table as it could be a right foot shot from outside the area.
    Other things that my mind said could go in the Events table would be Started Match, Substituted and Unused Substitute, although part of me says that it would be an idea to create two different Events tables, one for appearance information such as Time On, Time Off, Type of Appearance (started match etc) and another for the 'In Match' Events suchs as the bookings.

    Quote Originally Posted by healdem View Post
    shots on goal (you'd need to think how you produce the shots on target bit (you'd eother double record a goal (one entry for goal, one entry for shot on target) or better add goals scored to shots on target. however you would need to refine how you treat things like own goals
    I would probably keep these as separate entities as for matches other than the ones my team plays, I won't have the shots on target statistics.

    Quote Originally Posted by healdem View Post
    if a match official is replaced (eg referee changes..). you could consider having say two match officials
    say if more thna two players are involved in an incident. you could delegate the players to a sub table which woudl then allow you to handle multiple players involved.
    With the Officials I think that given the amount of times they are replaced in the games I will be tracking it is probably not overly important at the moment. However, as it does happen it is something that I would actually want to get in there eventually.

    Quote Originally Posted by healdem View Post
    say if you wanted record activities from players who are not on the field or the substitutes bench (eg Neville from the Scum mouthing off to Tevez
    Funnily enough an opposing player mouthed off the referee whilst warming up and was given a yellow card. He then gobbed off again and was given a second yellow without having even taken his tracksuit off! So recording those things would be part of it.

    Quote Originally Posted by healdem View Post
    You would need to think about how you identify home and away games. the natural choice in my mind would be Team A is home, Team B is away (unless you are US based in which its the other way round.
    That would be the way I would do it but in my provisional tables the fields are HomeTeam and AwayTeam.

    Quote Originally Posted by healdem View Post
    if you want to knock about the model I'm quite happy to post a copy of it here. you woudl need to download Power*Architect to manipulate and refine the model.
    SQL Power Software - SQL Power Tools
    I will have a look at SQL Power tools later on as I have not seen it before.
    I did have a bit of a look at your model at work and tried to set out my own version with a few tweaks.
    I have used DBDesigner 4 and although it seems pretty straightforward some of the 'point and click' functionality left me confused at the output!
    I have attached what I have got so far and will explain my theories on why I have done it that way, starting with the bits that are confusing me.....

    MATCHES
    The HomeTeam and AwayTeam fields will need to both be taken from the 'TeamNames' table, I think.
    When I put a date in the 'Matches' table I want the Competition Field to be restricted to only those Competitions played in the relevant season. For example, if you put 01/01/1986 in, there would be no point having the Europa League in as an option because the competition did not exist then. And, likewise if you were entering a match on today's date, there would be not point having Fairs Cup as a competition option as that has not been around for years.
    This is why I have created the 'CompetitionBySeason' table so you can assign competitions only to the years they were played.
    Then once you have entered the competition, only the teams that have been assigned to that competition for that season in 'CompetitionTeams' would be available for selection.
    The creation of the links for 'CompetitionTeams' in DBDesigner seems to have gone a bit loopy and added more than I think it needs!
    TEAMSSTADIUMS
    Because a team may have played at several different stadiums over the years, and a stadium may have had several different tenants, I feel that if you assign stadiums to clubs between certain dates, it will automatically know what stadium the game was played at.
    Obviously a Stadium could have two different teams using it as a home ground at the same time but clubs will not have more than one 'home ground' on a given date.
    With my team, we had 5 different grounds before we moved to our current one. We then shared another sides ground while ours was due to be redeveloped but moved back for a few games for various reasons. We then spent the rest of that season and all of the next at the other ground and spent the following at yet another ground before moving back home.

    TEAMNAMES
    Clubs have sometimes had several different names over the years and even my side went from Margate Town to Margate, then to Thanet United and back to Margate again. When entering data I would want Thanet United to be available from 1981-1989 and would not want Margate to be available during that time.
    The difficult part I foresee with this part is when it comes to assigning stadiums and players to teams but would imagine if they were assigned to the actual club and not the team name, the rest of it would fall in to place via the From and To dates.
    I would also want a way to show the Clubs current name and also be able to list previous names.

    CLUBPLAYERS
    Once again a player would only be with a team for a certain period and may be with the team on two separate occasions so I feel if they could be assigned to clubs between certain dates it will restrict the payers available for a match. Obviously if a player is with a club on two different spells he would just appear twice in the table with two different From dates and two different To dates.
    The other thing I would want to add to that table is whether they were a permanent signing or a loan signing.

    So that is the gist of where I am at at the moment but will have another play around tonight and add some more hopefully.
    Obviously, with my inexperience some of the joins are probably wrong and although I did read through your explanation of the notations, because it was before my post work coffee I didn't take it in that much but will have another look.

    I hope I have not overcomplicated any of this but I have tried to give as much information as possible.

    Thanks again
    Steve
    Attached Thumbnails Attached Thumbnails footballDBmodelv1.png  

  6. #6
    Join Date
    Nov 2004
    Location
    out on a limb
    Posts
    13,692
    Provided Answers: 59
    I think its right to take ownership of your own data model
    Just a few comments on mine, not being defensive (well perhaps I am)...
    the plan was that the fixtures would be populated as soon as its known. there is scope to define different competitions (eg league, League Cup, FA Cup and so on)
    within each cometition you then define the year, and the level/round within each year... so that should handle round 1,2....n of say the FA or league cup
    changing club names is realtively trivial, add a child table off footballclubs woudl allow for altenrtaive names with say a date bracket from to to indicate when that neam was valid
    grounds isn't really developed, you could argue that a club has a ground, or that the club should be part of the PK for Locations. however we needed to model where a game may be played say at Wembley, so there would be no owning club so that ruled out ClubId as part of the primary key
    the issue I had with the fixture events tabel si more to do with where people may be ionvolved in the game but noit declared as substitures. the fact a substitute got two bookings without reaching the pitch is fine, thats handled by the ClubsteaemForFixture/PalyersRoles entities where you define who is officially available or woudl be commentatale. if the person isn't on that list then my model wouldn't supposrt a comment

    looking at the players Role I think the FieldRow and Field Position was to be able to record where a player was, eg FieldRow could be Keeper, FieldPosition coudl identify the postion in each row. say if the formation was 442 then the Left Back could be 2,1 the left centre back 2,2 right back 2,4 and so on. The idea was to use negative FieldRow numbers to indicate they were subs (eg -1 substitue keeper), zero to indicate they were there as non playing staff. I didn't take into account a substitution as aprt of the event.. the idea being that when a player was subbed the field row and position would be adjusted accordingly. that would fail when a player is reallocated a role in the game. again thats an artifact of where the model got to before it ceased development

    you'd need to think through the naming/numbering convention.. it may require another entity to record that naming convention you'd need to work out what the possible formations were and allocate numbers accordingly so you could record each formation

    FWIW some brief observations on your model..
    I:d be wary of crating different entites for what are in essence similar thigns, fersinstance a goal is also a shot on target. y'don't need to store separately a goal or the final score. arguably for performance reasons you could roll up the final score if you found performance was an issue. In essence a booking is an event, a goal is an event. Fergie time is an event

    there are matches where there is no true home team (eg semi finals or even replays for the FA Cup).

    I'd argue the link from matches to Teamnames is wrong.. its should be to clubs, OR if you want to got that way fine, but be consistent and link clubplayers to teamnames not clubs

    nationality should be a FK to countries not varchar..... as you are leaving yourself open to problems with spelling and typos, probably that should be nullable to allow for the situations where you don't know the nationality.

    again I'd make the point that a game may be played at a ground where there is no owning club, such as Wembly. if you don't think you model needs to support that then fine. but its posisble (and we all know fans are optimists at heart Margate may amke it to Wembley next year

    I'd expect the competition, competition year (and level eg round 1, round 5 replay) to be part of the match PK... becuase it uniquely describes a specific match.

    competition by season.. the season could be the PK getttign rid of the ID

    eg 20102011 could be the ID for the season, or if you prefer 2010/2011. although you'd need to stick to 4 digit years to avoid issues.
    one issue I can see is that you defien a season start date and end date, which may different for each league or some leagues. again if that not an issue then fine. if it is an issue you would need to define a season for each competition.

    but as stated first off, take ownership of your own model, make it reflect your requirements. theres often different perspectives of what is or isn't required for a dat mdoel. it needs to support your (the user / customer) perceived requirements not someone elses, it needs to support possible changes. if 'all' you want to do is record who Margate have played over the years then thats one requirement, if you need to record say premier league, league cup, FA cup and European cups then that changes the perspective. if you need an online query to fill in the waffle that commentators seem to like to drop in at quiet moments then thats another requirment
    I'd rather be riding on the Tiger 800 or the Norton

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •