  1. #1
    Join Date
    Jun 2009
    Posts
    4

    Normalization Question - US Zip Codes

    I'm trying hard to understand and apply rules of normalization for a database I'm creating, but I came across a question related to US Zip Codes.

    I was following along with a book (SQL Server 2005 for developers) that does a step-by-step walkthrough of normalizing a database. To avoid duplicate data, the book removes city and state from the Address table and moves them to a ZipCodes table, so the Address table stores only the Zip Code. City and state can then be looked up by Zip Code.

    There is one problem with this: several towns often share the same Zip Code (my town shares one with two neighboring towns), so a Zip Code does not identify a single city and the look-up doesn't work. The towns can be told apart by Zip + 4, but you can't require people to enter their Zip + 4. After all, who remembers it? I don't know my +4 off the top of my head.
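    To make the problem concrete, here is a minimal sketch of checking whether one column functionally determines another in a set of rows. The helper name `violates_fd` and the sample zip codes and town names are made up for illustration; they are not from the book.

```python
from collections import defaultdict


def violates_fd(rows, determinant, dependent):
    """Return each determinant value that maps to more than one dependent value.

    An empty result means the dependency determinant -> dependent holds
    for these rows (it says nothing about rows not in the sample).
    """
    seen = defaultdict(set)
    for row in rows:
        seen[row[determinant]].add(row[dependent])
    return {key: sorted(values) for key, values in seen.items() if len(values) > 1}


# Hypothetical sample data: placeholder zip codes and town names.
addresses = [
    {"zip": "00001", "city": "Town A", "state": "XX"},
    {"zip": "00001", "city": "Town B", "state": "XX"},
    {"zip": "00002", "city": "Town C", "state": "XX"},
]

# Zip codes that are shared by more than one city.
conflicts = violates_fd(addresses, "zip", "city")
print(conflicts)
```

    If the result is non-empty, zip does not determine city in that data, so a table keyed on zip alone cannot hold one city per row.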

    Is it really bad normalization practice to keep the city/state in the Address table, even if the same city/state combos will appear over and over again?

    In general, I guess the question is: Is third normal form a practical and necessary goal, or just an 'ideal' that is not always achieved?

  2. #2
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Third normal form is a practical and necessary goal.

    Making City and State depend on Zip Code is incorrect, so any model based on that assumption has to be incorrect right from the get-go.

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  3. #3
    Join Date
    Jun 2009
    Posts
    4
    The book most likely over-simplified the example, as books often do.

    I just saw another example that uses a PostalCodeID and lists every city that shares the same zip code in its own row. I suppose I could try to get access to the post office API and get the info that way.
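    A rough sketch of how that PostalCodeID layout might look, using Python's built-in `sqlite3` module so it runs anywhere. The table names, column names, and sample rows are my own assumptions, not the ones from that example, and SQL Server syntax would differ slightly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# One row per (zip, city, state) combination, identified by a surrogate key.
conn.executescript("""
CREATE TABLE postal_code (
    postal_code_id INTEGER PRIMARY KEY,
    zip   TEXT NOT NULL,
    city  TEXT NOT NULL,
    state TEXT NOT NULL,
    UNIQUE (zip, city, state)
);
CREATE TABLE address (
    address_id     INTEGER PRIMARY KEY,
    street         TEXT NOT NULL,
    postal_code_id INTEGER NOT NULL REFERENCES postal_code (postal_code_id)
);
""")

# Hypothetical sample data: two towns share zip 00001.
conn.executemany(
    "INSERT INTO postal_code (zip, city, state) VALUES (?, ?, ?)",
    [("00001", "Town A", "XX"), ("00001", "Town B", "XX"), ("00002", "Town C", "XX")],
)


def cities_for_zip(conn, zip_code):
    """Return (postal_code_id, city, state) candidates for a zip code."""
    cur = conn.execute(
        "SELECT postal_code_id, city, state FROM postal_code "
        "WHERE zip = ? ORDER BY city",
        (zip_code,),
    )
    return cur.fetchall()


# The user types a zip, picks one of the candidate towns, and the address
# row stores only the chosen postal_code_id.
candidates = cities_for_zip(conn, "00001")
chosen_id = candidates[1][0]  # the second candidate, "Town B"
conn.execute(
    "INSERT INTO address (street, postal_code_id) VALUES (?, ?)",
    ("1 Example St", chosen_id),
)

full_address = conn.execute(
    "SELECT a.street, p.city, p.state, p.zip "
    "FROM address AS a JOIN postal_code AS p "
    "ON p.postal_code_id = a.postal_code_id"
).fetchone()
print(full_address)
```

    The address row no longer repeats city and state, and the zip no longer has to identify a single city, because the key is the surrogate ID rather than the zip.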

  4. #4
    Join Date
    Dec 2007
    Location
    London, UK
    Posts
    741
    In the United Kingdom a postal code can, in principle, be used to determine town and county. In my experience, however, it doesn't usually make sense to create a new table keyed only on postcode. The reason is that address data is messy. Parts of it may be incomplete or incorrect, and address cleansing software isn't 100% or even 90% reliable at sorting it out.

    So for most purposes a postcode does not reliably determine the other parts of the address, even though it is supposed to. Since that functional dependency does not actually hold in the data, it is no violation of Boyce-Codd Normal Form (the NF that deals with functional dependency) to have postcode and the rest of the address in a single address table.

  5. #5
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Buy a different book.
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com
