Page 1 of 2 12 LastLast
Results 1 to 15 of 30
  1. #1
    Join Date
    Nov 2006
    Posts
    33

    Normalisation trouble - Redundant data still apparent after 3NF =/

    Hi Guys,

    I need to design a database for an online purchasing system,
    here is a list of proposed attributes at UNF
    UNF:
    ORDER
    OrderNo
    OrderDate
    CustomerTitle
    CustomerForename
    CustomerSurname
    CustomerAddressLine1
    CustomerAddressLine2
    CustomerTown
    CustomerCounty
    CustomerPostcode
    CustomerEmail
    CustomerTelephoneNo
    { ItemName
    ItemDescription
    ItemQuantity
    ItemPrice
    ParcelId
    ParcelSendDate}
    DeliveryTitle
    DeliveryForename
    DeliverySurname
    DeliveryAddressLine1
    DeliveryAddressLine2
    DeliveryTown
    DeliveryCounty
    DeliveryPostcode
    CardType
    CardNo
    StartDate
    EndDate
    IssueNo
    CardVerificationNo
    NOTE* The repeating group has already been marked

    My progress so far:

    I have managed to normalise the data to the 3NF....

    However, there is still 1 minor problem:

    3NF:
    ORDER-3
    OrderNo
    OrderDate
    CardNo
    DeliveryNo
    CustomerNo

    CUSTOMER-3
    CustomerNo
    CustomerTitle
    CustomerForename
    CustomerSurname
    CustomerEmail
    CustomerTelephoneNo
    CustomerAddressId

    CUSTOMER-ADDRESS-3
    CustomerAddressId
    CustomerAddressLine1
    CustomerAddressLine2
    CustomerTown
    CustomerCounty
    CustomerPostcode

    DELIVERY-3
    DeliveryNo
    DeliveryTitle
    DeliveryForename
    DeliverySurname
    DeliveryAddressId

    DELIVERY-ADDRESS-3
    DeliveryAddressId
    DeliveryAddressLine1
    DeliveryAddressLine2
    DeliveryTown
    DeliveryCounty
    DeliveryPostcode

    CARD-3
    CardNo
    CardType
    StartDate
    EndDate
    IssueNo
    CardVerificationNo

    ORDER-ITEM-3
    ItemNo
    CustomerNo
    ParcelId
    ItemQuantity

    PARCEL-3
    ParcelId
    ParcelSendDate

    ITEM-3
    ItemNo
    ItemName
    ItemDescription
    ItemPrice

    As you can see there is still a customer address and delivery address.... its pretty obvious that these are the same thing.... but in its current state it could potentially mean that there would be duplicate records (1 for the customers address and then if same customer orders to himself... 1 for delivery address)


    I need some way of doing this "WITHOUT" breaking the rules of normalisation......


    Now i thought of perhaps considering that each address that a customer delivers to ... could be considered a potential customer... so could be added into the customer table...

    The UNF would be different like this:
    ORDER
    ...
    ..
    ..
    CustomerTitle
    CustomerForename
    CustomerSurname
    ...
    ..
    DeliveryId
    ..
    ..

    The delivery Id would be a link to a customer record and thus eliminating the data redundancy...

    However, is this a valid method "WITHOUT" breaking the rules of normalisation?
    I mean, i have artificially inserted that attribute before normalisation has taken place based on observations....

    Alternatively... can anyone else recommended any other method of solving this problem?

    Thanks in advance

    --Philkills

  2. #2
    Join Date
    Nov 2002
    Location
    Jersey
    Posts
    10,322
    Have an Address type column and then use COALESCE?
    Brett
    8-)

    It's a Great Day for America everybody!

    dbforums Yak CorralRadio 'Rita
    dbForums Member List
    I'm Good Once as I ever was

    The physical order of data in a database has no meaning.

  3. #3
    Join Date
    Nov 2006
    Posts
    33
    Im sorry, this is my first uni module in database design...

    We have not gone over "COALESCE" yet =/

    Is this the only way it can be done? =/

    btw thx for the reply =-)

  4. #4
    Join Date
    Sep 2002
    Location
    UK
    Posts
    5,171
    Instead of 2 tables CUSTOMER-ADDRESS-3 and DELIVERY-ADDRESS-3, why not just one:

    ADDRESS-3
    AddressId
    AddressLine1
    AddressLine2
    Town
    County
    Postcode

    The CustomerAddressId and DeliveryAddressId columns in CUSTOMER-3 and DELIVERY-3 can both reference ADDRESS-3.AddressId

  5. #5
    Join Date
    Nov 2006
    Posts
    33
    Quote Originally Posted by andrewst
    Instead of 2 tables CUSTOMER-ADDRESS-3 and DELIVERY-ADDRESS-3, why not just one:

    ADDRESS-3
    AddressId
    AddressLine1
    AddressLine2
    Town
    County
    Postcode

    The CustomerAddressId and DeliveryAddressId columns in CUSTOMER-3 and DELIVERY-3 can both reference ADDRESS-3.AddressId
    yes but, isn't that breaking the rules of normalisation?

  6. #6
    Join Date
    Sep 2002
    Location
    UK
    Posts
    5,171
    No, it isn't. What makes you think it is?

  7. #7
    Join Date
    Nov 2006
    Posts
    33
    The fact that it seems as though you are suggesting ... that i artificially join these 2 tables into 1 after i have completed normalisation....

    Maybe im wrong, could you please explain how this is possible by following the rules of normalisation?

  8. #8
    Join Date
    Sep 2002
    Location
    UK
    Posts
    5,171
    It is more or less a "repeating group" in the original list of attributes, is it not?

    If the list had also contained:
    OrderTakenByEmployeeId
    OrderTakenByEmployeeName
    OrderTakenByEmployeePhone
    OrderCheckedByEmployeeId
    OrderCheckedByEmployeeName
    OrderCheckedByEmployeePhone

    ... then I hope you would end up with a single EMPLOYEE table, not an ORDER-TAKEN-BY-EMPLOYEE table and an ORDER-CHECKED-BY-EMPLOYEE table!

  9. #9
    Join Date
    Sep 2002
    Location
    Sydney, Australia
    Posts
    255
    PhilKills

    Andrew has already answered the question, so I will not repeat. Re whether it "breaks" Normalisation rules, etc, it needs to be understood that for a uni course it is quite appropriate to evaluate Normalisation in isolation, but good sense, good design rules and Normalisation rules exist in parallel in the same universe, not in separate universes. In fact they have the same goals. They can be applied at the same time in parallel.

    Cheers
    Last edited by DerekA; 11-30-06 at 01:08.
    Derek Asirvadem
    Senior Sybase DBA/Information Architect derekATsoftwaregemsDOTcomDOTau
    Anything worth doing is worth doing Right The First Time
    Spend your money on standards-compliant development or spend 10 times more fixing it

  10. #10
    Join Date
    Nov 2006
    Posts
    33
    Quote Originally Posted by andrewst
    It is more or less a "repeating group" in the original list of attributes, is it not?

    If the list had also contained:
    OrderTakenByEmployeeId
    OrderTakenByEmployeeName
    OrderTakenByEmployeePhone
    OrderCheckedByEmployeeId
    OrderCheckedByEmployeeName
    OrderCheckedByEmployeePhone

    ... then I hope you would end up with a single EMPLOYEE table, not an ORDER-TAKEN-BY-EMPLOYEE table and an ORDER-CHECKED-BY-EMPLOYEE table!
    Ok,

    Based on the assumption that address is a repeating group i have normalised all attributes to 3NF:
    UNF:
    ORDER
    OrderNo
    OrderDate
    {Title
    Forename
    Surname
    AddressLine1
    AddressLine2
    Town
    County
    Postcode
    Email
    TelephoneNo}
    { ItemName
    ItemDescription
    ItemQuantity
    ItemPrice
    ParcelId
    ParcelSendDate}
    CardType
    CardNo
    StartDate
    EndDate
    IssueNo
    CardVerificationNo

    1NF:
    ORDER-1
    OrderNo
    OrderDate
    CardType
    CardNo
    StartDate
    EndDate
    IssueNo
    CardVerificationNo

    ORDER-CUSTOMER-1
    OrderNo
    CustomerNo
    Title
    Forename
    Surname
    AddressLine1
    AddressLine2
    Town
    County
    Postcode
    Email
    TelephoneNo

    ORDER-ITEM-1
    OrderNo
    ItemNo
    ItemName
    ItemDescription
    ItemQuantity
    ItemPrice
    ParcelId
    ParcelSendDate



    2NF:
    ORDER-2
    OrderNo
    OrderDate
    CardType
    CardNo
    StartDate
    EndDate
    IssueNo
    CardVerificationNo

    ORDER-CUSTOMER -2
    OrderNo
    CustomerNo


    CUSTOMER-2
    CustomerNo
    Title
    Forename
    Surname
    AddressLine1
    AddressLine2
    Town
    County
    Postcode
    Email
    TelephoneNo

    ORDER-ITEM-2
    OrderNo
    ItemNo

    ItemQuantity
    ParcelId
    ParcelSendDate

    ITEM-2
    ItemNo
    ItemName
    ItemDescription
    ItemPrice

    3NF:
    ORDER-3
    OrderNo
    OrderDate
    CardNo

    CARD-3
    CardNo
    CardType
    StartDate
    EndDate
    IssueNo
    CardVerificationNo

    ORDER-CUSTOMER -3
    OrderNo
    CustomerNo


    CUSTOMER-3
    CustomerNo
    Title
    Forename
    Surname
    AddressNo
    Email
    TelephoneNo

    ADDRESS-3
    AddressNo
    AddressLine1
    AddressLine2
    Town
    County
    Postcode

    ORDER-ITEM-3
    OrderNo
    ItemNo

    ItemQuantity
    ParcelId

    PARCEL-3
    ParcelId
    ParcelSendDate

    ITEM-3
    ItemNo
    ItemName
    ItemDescription
    ItemPrice

    The only problem with this layout... is that you would need additional assumptions, such as:

    1. The 1st customer occurance in ORDER-CUSTOMER is the customers ID.
    2. The 2nd occurance is the delivery ID
    3. If a customer is sending the goods to theirselfs... then there will be only 1 entry(customers ID).

    However, wouldn't this also slow down queries?

    Such as, if you wanted to just print out all your customers.....etc...

    I realise that you can break the rules of normalisation to improve the design "if necessary"... However, our lecturer has already stated that in this circumstance we "WILL" lose marks if we do this.

    So in conclusion, which solution is better?

    Redundant data..... or performance?

    Or perhaps secret solution number 3? - which i have no idea what it is

    thanx for the help so far guys, starting to really understand how all this fits together ^^

    -- Philkills

  11. #11
    Join Date
    Sep 2002
    Location
    UK
    Posts
    5,171
    You have taken a wrong turn somewhere - probably by following some normalisation "cookbook" approach rather than understanding the data.

    What I meant was that instead of this:
    Code:
    CUSTOMER-3
    CustomerNo
    CustomerTitle
    CustomerForename
    CustomerSurname
    CustomerEmail
    CustomerTelephoneNo
    CustomerAddressId
    
    CUSTOMER-ADDRESS-3
    CustomerAddressId
    CustomerAddressLine1
    CustomerAddressLine2
    CustomerTown
    CustomerCounty
    CustomerPostcode
    
    DELIVERY-3
    DeliveryNo
    DeliveryTitle
    DeliveryForename
    DeliverySurname
    DeliveryAddressId
    
    DELIVERY-ADDRESS-3
    DeliveryAddressId
    DeliveryAddressLine1
    DeliveryAddressLine2
    DeliveryTown
    DeliveryCounty
    DeliveryPostcode
    ... you could/should have this:
    Code:
    CUSTOMER-3
    CustomerNo
    CustomerTitle
    CustomerForename
    CustomerSurname
    CustomerEmail
    CustomerTelephoneNo
    CustomerAddressId
    
    DELIVERY-3
    DeliveryNo
    DeliveryTitle
    DeliveryForename
    DeliverySurname
    DeliveryAddressId
    
    ADDRESS-3
    AddressId
    AddressLine1
    AddressLine2
    Town
    County
    Postcode
    Both CustomerAddressId and DeliveryAddressId are foreign keys that reference AddressId.

    This way, if the customer address and delivery address are the same, you still only store the address attributes once - which is one of the goals of normalisation.

  12. #12
    Join Date
    Nov 2006
    Posts
    33
    Quote Originally Posted by andrewst
    You have taken a wrong turn somewhere - probably by following some normalisation "cookbook" approach rather than understanding the data.

    What I meant was that instead of this:
    Code:
    CUSTOMER-3
    CustomerNo
    CustomerTitle
    CustomerForename
    CustomerSurname
    CustomerEmail
    CustomerTelephoneNo
    CustomerAddressId
    
    CUSTOMER-ADDRESS-3
    CustomerAddressId
    CustomerAddressLine1
    CustomerAddressLine2
    CustomerTown
    CustomerCounty
    CustomerPostcode
    
    DELIVERY-3
    DeliveryNo
    DeliveryTitle
    DeliveryForename
    DeliverySurname
    DeliveryAddressId
    
    DELIVERY-ADDRESS-3
    DeliveryAddressId
    DeliveryAddressLine1
    DeliveryAddressLine2
    DeliveryTown
    DeliveryCounty
    DeliveryPostcode
    ... you could/should have this:
    Code:
    CUSTOMER-3
    CustomerNo
    CustomerTitle
    CustomerForename
    CustomerSurname
    CustomerEmail
    CustomerTelephoneNo
    CustomerAddressId
    
    DELIVERY-3
    DeliveryNo
    DeliveryTitle
    DeliveryForename
    DeliverySurname
    DeliveryAddressId
    
    ADDRESS-3
    AddressId
    AddressLine1
    AddressLine2
    Town
    County
    Postcode
    Both CustomerAddressId and DeliveryAddressId are foreign keys that reference AddressId.

    This way, if the customer address and delivery address are the same, you still only store the address attributes once - which is one of the goals of normalisation.
    That is the most logical way of achieving this, however, as far as normalisation is concerned... they are completely unrelated Attributes....

    or else... where does it say : If redundant data is still present based on the observations of the Systems Analyst then it is within his discretion that an artificial link be created to prevent this.

    I only ask as i can't see how you can merge 2 entities while still following the normalisation rules... "OR" create 2 foreign keys to the same entity.

  13. #13
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    Quote Originally Posted by Philkills
    ...merge 2 entities
    an address is an address is an address

    there is only one address entity, but it has 2 different relationships
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  14. #14
    Join Date
    Nov 2006
    Posts
    33
    Quote Originally Posted by r937
    an address is an address is an address

    there is only one address entity, but it has 2 different relationships

    lol yea we as humans can see that...

    But what if you were a computer... and saw DeliveryAddress and CustomerAddress....

    As far as you are concerned(as a computer) they are 2 different things entirely.

    Is this something that is done outside of the normalisation process though?..

    i.e. once you have reached 3NF(or whatever NF you want to go to)..

    Or is this something you can do during the composite logical data diagram.... where you would compare your EAR diagram with the normalisation... (as you should get your solution with the EAR as its more the opinion of the analyst)... and thus you could justify it that way?
    Last edited by Philkills; 11-30-06 at 15:13.

  15. #15
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    I generally have a single address table, and then use FKs to it for each address used anywhere in the system. That way there is an address for an invoice, another one for an employee, a different one for a client, but they all belong in the address table. This makes it lots easier to standardize on handling of addresses, and it means that if you find a problem (such as a check constraint or FK that is too restrictive or permissive), you fix it one time for everybody!

    -PatP

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •