Page 1 of 2 12 LastLast
Results 1 to 15 of 20
  1. #1
    Join Date
    Dec 2007
    Posts
    5

    Question How far is too far - normalization question.

    Ok so I've just started with database design and I think normalization is pretty logical... My problem is, knowing when to stop creating tables. I mean ultimately you could have a whole bunch of two-column tables...

    So a simple scenario...
    There are people selling things. These things can be sold in many different locations. Each item has a category (food, drink, furniture etc).

    Consider this database:

    People Table
    PID (Primary Key)
    Surname
    Name

    Category Table
    CID (Primary Key)
    Name

    Price Table
    PriceID (Primary Key)
    PID
    IID
    Price

    Location Table
    LID (Primary Key)
    Country
    City
    Suburb

    Item Table
    IID (Primary Key)
    Name
    CID
    LID


    So, many items can have the same location, and many items can be sold by the one person. Items are not unique to a location, so multiple items can have the same name but a different location.

    As far as I can tell this database is at least 3NF. (anyone think otherwise?)

    My question is, is it overkill to also create a table for countries, and maybe even cities? Because in the location table, the same country can appear multiple times, so can the same city, and definitely the same suburb. But since its rare New York or USA will actually change its name, is it necessary for me to create a table:

    Country Table
    CountryID (Primary Key)
    Name

    ? Would I then do it for City and Suburb too?


    Thanks for any suggestions!


    Zdenek

  2. #2
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    Hi

    The question you are asking is not to do with normalisation. Remember that the consideration for repeating data is horizontal not vertical. Normalisation is about dependancies between columns not whether or not the same data can appear in n rows for a given column.

    Check out this wee beastie:
    http://www.tonymarston.net/php-mysql...se-design.html

  3. #3
    Join Date
    Dec 2007
    Posts
    5
    oooh ok. that makes sense.

    thanks!

  4. #4
    Join Date
    May 2005
    Location
    San Antonio, Texas
    Posts
    134
    hehe, you could break those out but it is generally accepted not to do that. it would be extreme and you really wouldn't get much benefit.

    I am curious about what you have there though. Each person has a different price for each item?

    The way it is there you could have this
    Sam selling lounge chair #4 for $44
    Max selling lounge chair #4 for $55

    Obviously you know the domain better than I do, but I was curious about that. Is it intended for item prices to be linked to multiple people and items?
    Vi veri veniversum vivus vici
    By the power of truth, I, a living man, have conquered the universe

  5. #5
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    Use of surrogates in a logical design makes things a bit trickier to read IMHO.

    I have a problem with the Item table (probably related to amthomas's problem with the price table - though I think for a different reason)

  6. #6
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    Quote Originally Posted by Zdenek
    My question is, is it overkill to also create a table for countries, and maybe even cities?
    yes

    Quote Originally Posted by Zdenek
    But since its rare New York or USA will actually change its name, is it necessary for me to create a table:

    Country Table
    CountryID (Primary Key)
    Name
    no!!!!!!

    instead, you could create a country table like this:

    Country Table
    Name (Primary Key)

    now, you may ask yourself, how did i get here? oops, sorry, flashback to the 80s...

    now, you may ask yourself, what would the purpose of such a table be? well, for one thing, with a foreign key to this table, you would be guaranteed to prevent the insertion of a country name that wasn't pre-registered in the country table, which is called relational integrity


    by the way, pootle, there is no such thing as a surrogate key in logical modelling

    surrogate keys may—may, not must—be defined when entities are translated into tables in physical modelling

    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  7. #7
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    Quote Originally Posted by r937
    by the way, pootle, there is no such thing as a surrogate key in logical modelling

    surrogate keys may—may, not must—be defined when entities are translated into tables in physical modelling
    That's kind of my point - item id and item name in a logical design confuzzle me. I doubt the intention is that there can any number of items with the same name but cannot be sure with all these IDs flying around and no AKs specified.

    Totally agreed BTW re your coutry table however I would like to emphasise that is not now any "more" normalised - it is better designed though, sure

  8. #8
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    letting the days go by, water flowing under me...
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  9. #9
    Join Date
    Dec 2007
    Posts
    5
    Quote Originally Posted by amthomas
    Obviously you know the domain better than I do, but I was curious about that. Is it intended for item prices to be linked to multiple people and items?
    Today 17:32
    Pretty much yes. Its not the exact domain of the problem. Its waaay more complicated. I've never done anything with databases to this level of complexity before so I just presented the above problem to get an understanding for things The idea is, Joe at Walmart sells a Coke for $44 while Mike at McDonalds sells it for $55. Different people, places and prices. But it's a Coke everywhere.

    Quote Originally Posted by pootle flump
    Use of surrogates in a logical design makes things a bit trickier to read IMHO.......I doubt the intention is that there can any number of items with the same name but cannot be sure with all these IDs flying around and no AKs specified.
    Ok I wasn't up to scratch with the lingo, but from what I can tell, you're asking why have an items name & a key for it, when its name COULD be the key? The answer is from above. There are many items with the same name. But since their locations differ they aren't the same item. BUT at the same time the location ID can't be the key for an Item because items can come from the same loaction. AND to make it even worse, Joe at Walmart & Roger at Walmart can both be selling coke. While that seems absurd, its just a relationship i had to create for the real problem domain...


    Quote Originally Posted by r937
    ...with a foreign key to this table, you would be guaranteed to prevent the insertion of a country name that wasn't pre-registered in the country table, which is called relational integrity
    Thanks for this tip. Definitely gonna use it.

  10. #10
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    Quote Originally Posted by Zdenek
    There are many items with the same name. But since their locations differ they aren't the same item.
    i'm going to suggest that you re-think this

    Coke is Coke is Coke, no matter who sells it

    have a look at the can -- there is a barcode with the same GTIN on every one

    the GTIN, by the way, is an example of a great natural key, and i recommend that you use it
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  11. #11
    Join Date
    Dec 2007
    Posts
    5
    Quote Originally Posted by r937
    the GTIN, by the way, is an example of a great natural key, and i recommend that you use it
    i agree but the problem comes in when the items don't ALL have natural keys. while again, this might sound weird, items could range from a can of Coke... to an entire restaurant... to an entire country itself. but thats part of the real problem domain, and the problem I'm paid to sort out I just needed a bit of background info.

  12. #12
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    so, continue your search for an item primary key, then

    (have GTIN as an optional attribute, and use it when it applies)

    note: a primary key does not have to be a single column

    it is common for entities to have a surrogate key, and thus a single-column PK, but not relations

    it is usually wrong to have a surrogate key in a relationship table

    thus, item_id and location_id might be surrogate keys in their respective tables, but the relationship table (this item is sold in this location) would have a composite primary key of ( item_id , location_id )
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  13. #13
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    Quote Originally Posted by r937
    letting the days go by, water flowing under me...
    You disagree?

  14. #14
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    i was merely trying to evict that earworm by quoting some more of the lyrics echoing around my largely empty mind

    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  15. #15
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    I think you are humouring me

    With or without the "u".

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •