Page 1 of 2 12 LastLast
Results 1 to 15 of 17
  1. #1
    Join Date
    Sep 2007
    Posts
    6

    Question Database normalization

    I am building a database to manage HR data for the company that I work for. I have a question about normalization for anybody that might be able to help...

    For each member of staff I need to store either one or two sets of emergency contact details, e.g. name, address phone number etc..
    The simplest way to do it is to add columns to staff_member table as follows:


    staff_member
    ========
    field1
    field2
    .
    .
    emergencycontact1_name
    emergencycontact1_phonenumber
    emergencycontact1_mobilenumber
    .
    .
    emergencycontact2_name
    emergencycontact2_phonenumber
    emergencycontact2_mobilenumber
    .
    .
    .


    Does this count as breaking normalization? I know it's repeated groups of columns but it's only repeated once..

    I guess I could also have a table called emergency..


    staff_member
    ========
    field1
    field2

    emergency
    ========
    name
    phonenumber
    mobilenumber


    The question then is: is it then better to add two fields to staff_member e.g. emergencycontact1 and emergencycontact2 that point to the corresponding records in emergency, or add a field to emergency that points to the corresponding staff member.

    Any help or ideas would be appreciated
    Thanks

  2. #2
    Join Date
    Jan 2007
    Location
    UK
    Posts
    11,434
    Quote Originally Posted by stevewelly
    Does this count as breaking normalization?
    No, but it is not 3NF.

    However, in this particular situations you are going to get two very different answers.
    My personal choice would be to have the emergency contact information in a separate table with a foreign key relationship to the employees.

    But... I can think of at least one SQL guru on this forum who will agree, but suggest you implement your first method.

    In short, in this specific situation there are 2 "correct" answers and you could get away with either - it's up to you to decide which is the best method for you.

    Would there *ever* be a need for a 3rd emergency contact?
    George
    Home | Blog

  3. #3
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    Quote Originally Posted by stevewelly
    is it then better to add two fields to staff_member e.g. emergencycontact1 and emergencycontact2 that point to the corresponding records in emergency, or add a field to emergency that points to the corresponding staff member.
    the latter

    but in your case, if you are willing to bet your job that there will only ever need to be 2 contacts, ever, then go ahead and put them into the main table

    yes, that breaks 1NF -- if not the actual letter of the law, then certainly the spirit
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  4. #4
    Join Date
    Jan 2007
    Location
    UK
    Posts
    11,434
    Quote Originally Posted by georgev
    But... I can think of at least one SQL guru on this forum who will agree, but suggest you implement your first method.
    Quote Originally Posted by r937
    but in your case, if you are willing to bet your job that there will only ever need to be 2 contacts, ever, then go ahead and put them into the main table
    My prediction was correct
    George
    Home | Blog

  5. #5
    Join Date
    Sep 2007
    Posts
    6
    Thanks for your help

    No, there will never be a 3rd contact. I'm thinking that the first method is certainly the easiest to implement - and less likely to go wrong considering how I plan to layout the form onscreen.

    Thanks again

  6. #6
    Join Date
    Jan 2007
    Location
    UK
    Posts
    11,434
    Before you decide to implement the first method - I have to ask you something else.

    How many employees (percentage wise) do you predict will have
    1. 0 emergency contacts
    2. 1 emergency contact
    3. 2 emergency contacts

    Furthermore; how do you plan on displaying the information?
    Personally I'd opt for a subform/list approach, with which I imagine the latter method being far easier to implement.
    George
    Home | Blog

  7. #7
    Join Date
    Sep 2007
    Posts
    6
    There will never be zero, or at least very very rarely. I imagine there will be a roughly 50/50 split between 1 and 2

    Thanks

  8. #8
    Join Date
    Jan 2003
    Location
    Massachusetts
    Posts
    5,800
    Another thing to consider, what happens when an employee removes their first choice of emergency contact? Is #2 supposed to become #1? Or can #1 be left blank? When you add a second one in, which "slot" does it go into?

  9. #9
    Join Date
    Sep 2007
    Posts
    6
    Quote Originally Posted by MCrowley
    Another thing to consider, what happens when an employee removes their first choice of emergency contact? Is #2 supposed to become #1? Or can #1 be left blank? When you add a second one in, which "slot" does it go into?
    Well the way I was planning to layout the form was to have two sets of textboxes, each labelled Name, Phone etc.. and the user just fills in as they like.. so yes #1 could be blank

  10. #10
    Join Date
    Jan 2007
    Location
    UK
    Posts
    11,434
    Very good point Crowley!

    Yeah, I'd opt for option 1 still - not been sold to the other option yet!
    George
    Home | Blog

  11. #11
    Join Date
    Mar 2007
    Location
    636f6d7075746572
    Posts
    770
    Here's a question regarding NULL (empty) storage.

    Suppose your employee only has 1 contact, and that that contact ONLY has a phone number (or is unwilling to supply mobile number).

    In your current (with normalisation) form you will have 4 empty fields (NULL):
    emergencycontact1_mobilenumber
    emergencycontact2_name
    emergencycontact2_phonenumber
    emergencycontact2_mobilenumber

    If you normalise your data into a separate table, you now have 1 emtpy field:
    emergencycontact1_mobilenumber

    I see that as wasting 4 times as much space... And i'm not a fan of blank or NULL columns

    edit : and going back to points provided by GeorgeV (and emphasised by Rudy), what happens when you scale this solution. At the current time you can only see there being a MAX of 2 emergency contacts, however when someone wants to add a 3rd (oh yes, you might not see it but it's a possibility - maybe a change in company policy), you now have to add 3 more columns to your table, which will be 99.999999% blank (NULL) because only 1 employee now needs all three. So your space wastage has grown exponentially. Whereas adding 1 row into another table doesn't waste any space at all....

    Just some things for you to ponder there.
    Last edited by aschk; 09-12-07 at 06:13.

  12. #12
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    someone should point out to aschk that NULLs do not "waste" space

    for comparison purposes, how much space is wasted in a VARCHAR if the value is an empty string?
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  13. #13
    Join Date
    Jan 2007
    Location
    UK
    Posts
    11,434
    0 bytes
    varchar[(n)]
    Variable-length non-Unicode character data with length of n characters. n must be a value from 1 through 8,000. Storage size is the actual length of the data entered, not n bytes. The data entered can be 0 characters in length.
    I will point out that if the need for a 3rd emergency contact was added - then there'd also need to be a change to the FE. If the data was in 3NF then (depending on your display method) there would not.
    George
    Home | Blog

  14. #14
    Join Date
    Sep 2007
    Posts
    6
    Thanks for everybody's views on this.. I decided in the end to go for the fully normalised approach ..
    However I do have another query.. The main employee table is rapidly approaching 100 columns due the the sheer amount of data that needs to be stored. Over half of those are bits of medical data such a boolean for whether or not the person smokes, or a integer for their weight..
    Is having that many columns in a table inherently a Bad Thing or is it OK considering it's normalised properly, but just contains a lot of information.. Obviously it would probably be a good idea to avoid "SELECT *" queries, but then I would do that as a good practice thing anyway


    EDIT: removing typo

  15. #15
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    it is perfectly okay
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •