Page 1 of 4 123 ... LastLast
Results 1 to 15 of 58

Thread: Normalization

  1. #1
    Join Date
    Jan 2004
    Posts
    4

    Normalization

    I have normalized my database, but I still have one table that I am not sure about. It has a primary key that is related to my main object. I have one column that stores a date and the other 9 columns store codes from another table and the codes are only 3-5 characters. So my question is should I remove the codes and store the code id's instead. I know the easy answer is yes, but since the codes are so small is it going to make any difference? The sql statements are easier to write to when I store the codes instead of the id's.

    Thanks

  2. #2
    Join Date
    Sep 2002
    Location
    UK
    Posts
    5,171

    Re: Normalization

    3-5 character codes are fine, as long as they are primary/unique keys to the parent table and you declare a foreign key for each one. However, if you think the codes will often change (e.g. code for "Boomerang" may change from "BOOM" to "BMRG" for some reason) and you want to avoid cascading updates, you could use surrogate keys ("ID"s) instead.

  3. #3
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    my concern would be with the other 9 columns that store codes from another table

    that sounds like it violates first normal form

    however, consider the example of firstname and lastname

    both are names, and yet, they are semantically different

    you hardly ever see a dataasbe design where firstname and lastname are not side by side in the same table, i.e. in apparent violation of first normal form if you consider them "the same"

    another example: home phone, business phone

    but what about these 9 codes? are there always going to be 9 of them? and if you switched them around, i.e. reshuffled the values in the 9 slots, would it make a difference?

    see, you can't really switch firstname and lastname, or home phone and business phone, because they really aren't the same thing

    i guess the *real* acid test is your search sql -- if you wanted to find the date that a particular code was added, would you have to search all 9 fields using ORs
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  4. #4
    Join Date
    Jan 2004
    Posts
    4
    The nine colums will always be the same. There is a colum for each day of the week and 2 for other things. Each row will have a id number, a code for each day of the week and the 2 other ones, and a date. r937 can you see anything wrong with this? actually I just saw something, I will need the Id number of the parent object so that I will be able to store multiple dates for each parent object. Anything else?

    Thanks again

  5. #5
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    can i see anything "wrong" with it? you mean, other than the fact that it's not in first normal form?

    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  6. #6
    Join Date
    Jan 2004
    Posts
    4
    So to make it 1st normal form I will need to use the code id's, instead of the codes themselves and the rest of the table design is okay?

  7. #7
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    no, to make it first normal form, you do not have 9 codes side by side in a row -- that's a "repeating group"

    i might do something like this --

    create table childobject
    ( parentobjectid integer
    foreign key references parentobject (parentobjectid)
    , childdate datetime
    , codetype smallint
    , code char(3)
    foreign key references codetable (code)
    , primary key (parentobjectid, childdate)
    )

    where codetype has one of 9 values (for sun through mon plus xx1 and xx2)
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  8. #8
    Join Date
    Jan 2004
    Posts
    4

    Thumbs up

    r937 thanks I didn't even think of that. I wondered about that in 1st normal form, but since they were already codes I didn't even think about it. Thanks again

  9. #9
    Join Date
    Mar 2003
    Location
    Australia
    Posts
    59
    r937,

    That is a very intersting take on 1NF and I think it is wrong.

    I always thought that 1NF was that each attribute must have a single value only. It has nothing to do with key dependencies.

    As far as I am aware "repeating groups" come under 2NF and it is a terrrible misconception that it has been labelled 1NF.

    Your example for HomePhone and BusinessPhone is a good example.
    They are distinct attributes (they do not depend on each other), contain single values and most probably really wholey on the key (3NF).

    Believe it or not, if the posters 7 "Day" columns are each dependant on the key (we can't know this without the full FD's), then it is possible for the schema to be in 3NF!

  10. #10
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    you know what? i think you're right

    i've been doing this (normalization) so long, it has become automatic, and i've lost touch with the correct terminology

    i realize now that what i have been calling not-1NF probably is

    but, as in this case, that often does not make it good design, though

    thanks for the wake-up call

    i shall no longer tell people their designs violate 1NF when i see tables like

    id
    amt1
    amt2
    amt3
    amt4
    amt5

    heck, that's probably 3NF!!!





    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  11. #11
    Join Date
    Mar 2003
    Location
    Australia
    Posts
    59
    I know what you mean...

    Most times you can look at a schema and just "know" that it violates some form.

    But without the FD's.. its just (hopefully) an educated guess.

  12. #12
    Join Date
    Feb 2004
    Posts
    8

    Re: Normalization

    Originally posted by athaynes
    I have normalized my database

    What do you mean by normalized? Do you mean that you have gone up to a certain normal form e.g. 3rd.

    If you can physically put the data into a DBMS system then it has to be in at least 1NF.

    The problem with the word normalised is that it has an imprecise meaning. It would be better if you could say the form the database is in.

  13. #13
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    not trying to be argumentative, but you can physically put this into any database:
    Code:
    create table items
         ( id integer
         , categories varchar(200)
         )
    insert into items values ( 1, '12,14,16,18,21' )
    insert into items values ( 2, '9,10,56' )  
    insert into items values ( 3, '11,34,67,937' )
    but i would not say it was 1NF

    i see this type of structure all the time by programmers making the (obviously difficult) transition to databases

    tell them to "normalize" by creating a relationship table with a single column for the category id, and half of them complain that they would rather use those neato string handling functions in their programming language and "join" to the categories table by querying one category at a time in a loop
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  14. #14
    Join Date
    Sep 2002
    Location
    UK
    Posts
    5,171
    I agree with you, but this is where it all gets a bit fuzzy. From a purist point of view, this is in 1NF, because each categories column contains a single value, of domain character string (and the fact that it contains commas is irrelevant - the database is ignorant of them). But of course, it goes against the spirit of 1NF and is in fact being used by the programmer to "work around" the pesky 1NF restriction, and is precisely as reprehensible as building a true 0NF table, if it were possible to do so...

  15. #15
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    tony, that's it exactly -- the "spirit" of normalization (there's a good sig line in there somewhere)

    going back to the 9 columns that contain codes from another table, this violates the spirit too, but as byrmol pointed out, technically that design is probably 3NF

    still, i would hate to write queries against a table with 9 side-by-side columns that have to join to the same table

    normalization has advantages and you will only realize those advantages if you do not violate the spirit of normalization
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •