If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > General > Database Concepts & Design > Normalisation trouble - Redundant data still apparent after 3NF =/

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 11-27-06, 15:06
Philkills Philkills is offline
Registered User
 
Join Date: Nov 2006
Posts: 33
Normalisation trouble - Redundant data still apparent after 3NF =/

Hi Guys,

I need to design a database for an online purchasing system,
here is a list of proposed attributes at UNF
UNF:
ORDER
OrderNo
OrderDate
CustomerTitle
CustomerForename
CustomerSurname
CustomerAddressLine1
CustomerAddressLine2
CustomerTown
CustomerCounty
CustomerPostcode
CustomerEmail
CustomerTelephoneNo
{ ItemName
ItemDescription
ItemQuantity
ItemPrice
ParcelId
ParcelSendDate}
DeliveryTitle
DeliveryForename
DeliverySurname
DeliveryAddressLine1
DeliveryAddressLine2
DeliveryTown
DeliveryCounty
DeliveryPostcode
CardType
CardNo
StartDate
EndDate
IssueNo
CardVerificationNo
NOTE* The repeating group has already been marked

My progress so far:

I have managed to normalise the data to the 3NF....

However, there is still 1 minor problem:

3NF:
ORDER-3
OrderNo
OrderDate
CardNo
DeliveryNo
CustomerNo

CUSTOMER-3
CustomerNo
CustomerTitle
CustomerForename
CustomerSurname
CustomerEmail
CustomerTelephoneNo
CustomerAddressId

CUSTOMER-ADDRESS-3
CustomerAddressId
CustomerAddressLine1
CustomerAddressLine2
CustomerTown
CustomerCounty
CustomerPostcode

DELIVERY-3
DeliveryNo
DeliveryTitle
DeliveryForename
DeliverySurname
DeliveryAddressId

DELIVERY-ADDRESS-3
DeliveryAddressId
DeliveryAddressLine1
DeliveryAddressLine2
DeliveryTown
DeliveryCounty
DeliveryPostcode

CARD-3
CardNo
CardType
StartDate
EndDate
IssueNo
CardVerificationNo

ORDER-ITEM-3
ItemNo
CustomerNo
ParcelId
ItemQuantity

PARCEL-3
ParcelId
ParcelSendDate

ITEM-3
ItemNo
ItemName
ItemDescription
ItemPrice

As you can see there is still a customer address and delivery address.... its pretty obvious that these are the same thing.... but in its current state it could potentially mean that there would be duplicate records (1 for the customers address and then if same customer orders to himself... 1 for delivery address)


I need some way of doing this "WITHOUT" breaking the rules of normalisation......


Now i thought of perhaps considering that each address that a customer delivers to ... could be considered a potential customer... so could be added into the customer table...

The UNF would be different like this:
ORDER
...
..
..
CustomerTitle
CustomerForename
CustomerSurname
...
..
DeliveryId
..
..

The delivery Id would be a link to a customer record and thus eliminating the data redundancy...

However, is this a valid method "WITHOUT" breaking the rules of normalisation?
I mean, i have artificially inserted that attribute before normalisation has taken place based on observations....

Alternatively... can anyone else recommended any other method of solving this problem?

Thanks in advance

--Philkills
Reply With Quote
  #2 (permalink)  
Old 11-27-06, 15:26
Brett Kaiser Brett Kaiser is offline
Window Washer
 
Join Date: Nov 2002
Location: Jersey
Posts: 10,303
Have an Address type column and then use COALESCE?
__________________
Brett
8-)

It's a Great Day for America everybody!

dbforums Yak CorralRadio 'Rita
dbForums Member List
I'm Good Once as I ever was

The physical order of data in a database has no meaning.
Reply With Quote
  #3 (permalink)  
Old 11-27-06, 16:05
Philkills Philkills is offline
Registered User
 
Join Date: Nov 2006
Posts: 33
Im sorry, this is my first uni module in database design...

We have not gone over "COALESCE" yet =/

Is this the only way it can be done? =/

btw thx for the reply =-)
Reply With Quote
  #4 (permalink)  
Old 11-28-06, 07:28
andrewst andrewst is offline
Moderator.
 
Join Date: Sep 2002
Location: UK
Posts: 5,171
Instead of 2 tables CUSTOMER-ADDRESS-3 and DELIVERY-ADDRESS-3, why not just one:

ADDRESS-3
AddressId
AddressLine1
AddressLine2
Town
County
Postcode

The CustomerAddressId and DeliveryAddressId columns in CUSTOMER-3 and DELIVERY-3 can both reference ADDRESS-3.AddressId
__________________
Tony Andrews
http://tinyurl.com/tonyandrews
Reply With Quote
  #5 (permalink)  
Old 11-28-06, 10:47
Philkills Philkills is offline
Registered User
 
Join Date: Nov 2006
Posts: 33
Quote:
Originally Posted by andrewst
Instead of 2 tables CUSTOMER-ADDRESS-3 and DELIVERY-ADDRESS-3, why not just one:

ADDRESS-3
AddressId
AddressLine1
AddressLine2
Town
County
Postcode

The CustomerAddressId and DeliveryAddressId columns in CUSTOMER-3 and DELIVERY-3 can both reference ADDRESS-3.AddressId
yes but, isn't that breaking the rules of normalisation?
Reply With Quote
  #6 (permalink)  
Old 11-28-06, 12:47
andrewst andrewst is offline
Moderator.
 
Join Date: Sep 2002
Location: UK
Posts: 5,171
No, it isn't. What makes you think it is?
__________________
Tony Andrews
http://tinyurl.com/tonyandrews
Reply With Quote
  #7 (permalink)  
Old 11-28-06, 16:14
Philkills Philkills is offline
Registered User
 
Join Date: Nov 2006
Posts: 33
The fact that it seems as though you are suggesting ... that i artificially join these 2 tables into 1 after i have completed normalisation....

Maybe im wrong, could you please explain how this is possible by following the rules of normalisation?
Reply With Quote
  #8 (permalink)  
Old 11-29-06, 05:44
andrewst andrewst is offline
Moderator.
 
Join Date: Sep 2002
Location: UK
Posts: 5,171
It is more or less a "repeating group" in the original list of attributes, is it not?

If the list had also contained:
OrderTakenByEmployeeId
OrderTakenByEmployeeName
OrderTakenByEmployeePhone
OrderCheckedByEmployeeId
OrderCheckedByEmployeeName
OrderCheckedByEmployeePhone

... then I hope you would end up with a single EMPLOYEE table, not an ORDER-TAKEN-BY-EMPLOYEE table and an ORDER-CHECKED-BY-EMPLOYEE table!
__________________
Tony Andrews
http://tinyurl.com/tonyandrews
Reply With Quote
  #9 (permalink)  
Old 11-29-06, 23:31
DerekA DerekA is offline
Registered User
 
Join Date: Sep 2002
Location: Sydney, Australia
Posts: 255
PhilKills

Andrew has already answered the question, so I will not repeat. Re whether it "breaks" Normalisation rules, etc, it needs to be understood that for a uni course it is quite appropriate to evaluate Normalisation in isolation, but good sense, good design rules and Normalisation rules exist in parallel in the same universe, not in separate universes. In fact they have the same goals. They can be applied at the same time in parallel.

Cheers
__________________
Derek Asirvadem
Senior Sybase DBA/Information Architect derekATsoftwaregemsDOTcomDOTau
Anything worth doing is worth doing Right The First Time
Spend your money on standards-compliant development or spend 10 times more fixing it

Last edited by DerekA; 11-30-06 at 00:08.
Reply With Quote
  #10 (permalink)  
Old 11-30-06, 10:11
Philkills Philkills is offline
Registered User
 
Join Date: Nov 2006
Posts: 33
Quote:
Originally Posted by andrewst
It is more or less a "repeating group" in the original list of attributes, is it not?

If the list had also contained:
OrderTakenByEmployeeId
OrderTakenByEmployeeName
OrderTakenByEmployeePhone
OrderCheckedByEmployeeId
OrderCheckedByEmployeeName
OrderCheckedByEmployeePhone

... then I hope you would end up with a single EMPLOYEE table, not an ORDER-TAKEN-BY-EMPLOYEE table and an ORDER-CHECKED-BY-EMPLOYEE table!
Ok,

Based on the assumption that address is a repeating group i have normalised all attributes to 3NF:
UNF:
ORDER
OrderNo
OrderDate
{Title
Forename
Surname
AddressLine1
AddressLine2
Town
County
Postcode
Email
TelephoneNo}
{ ItemName
ItemDescription
ItemQuantity
ItemPrice
ParcelId
ParcelSendDate}
CardType
CardNo
StartDate
EndDate
IssueNo
CardVerificationNo

1NF:
ORDER-1
OrderNo
OrderDate
CardType
CardNo
StartDate
EndDate
IssueNo
CardVerificationNo

ORDER-CUSTOMER-1
OrderNo
CustomerNo
Title
Forename
Surname
AddressLine1
AddressLine2
Town
County
Postcode
Email
TelephoneNo

ORDER-ITEM-1
OrderNo
ItemNo
ItemName
ItemDescription
ItemQuantity
ItemPrice
ParcelId
ParcelSendDate



2NF:
ORDER-2
OrderNo
OrderDate
CardType
CardNo
StartDate
EndDate
IssueNo
CardVerificationNo

ORDER-CUSTOMER -2
OrderNo
CustomerNo


CUSTOMER-2
CustomerNo
Title
Forename
Surname
AddressLine1
AddressLine2
Town
County
Postcode
Email
TelephoneNo

ORDER-ITEM-2
OrderNo
ItemNo

ItemQuantity
ParcelId
ParcelSendDate

ITEM-2
ItemNo
ItemName
ItemDescription
ItemPrice

3NF:
ORDER-3
OrderNo
OrderDate
CardNo

CARD-3
CardNo
CardType
StartDate
EndDate
IssueNo
CardVerificationNo

ORDER-CUSTOMER -3
OrderNo
CustomerNo


CUSTOMER-3
CustomerNo
Title
Forename
Surname
AddressNo
Email
TelephoneNo

ADDRESS-3
AddressNo
AddressLine1
AddressLine2
Town
County
Postcode

ORDER-ITEM-3
OrderNo
ItemNo

ItemQuantity
ParcelId

PARCEL-3
ParcelId
ParcelSendDate

ITEM-3
ItemNo
ItemName
ItemDescription
ItemPrice

The only problem with this layout... is that you would need additional assumptions, such as:

1. The 1st customer occurance in ORDER-CUSTOMER is the customers ID.
2. The 2nd occurance is the delivery ID
3. If a customer is sending the goods to theirselfs... then there will be only 1 entry(customers ID).

However, wouldn't this also slow down queries?

Such as, if you wanted to just print out all your customers.....etc...

I realise that you can break the rules of normalisation to improve the design "if necessary"... However, our lecturer has already stated that in this circumstance we "WILL" lose marks if we do this.

So in conclusion, which solution is better?

Redundant data..... or performance?

Or perhaps secret solution number 3? - which i have no idea what it is

thanx for the help so far guys, starting to really understand how all this fits together ^^

-- Philkills
Reply With Quote
  #11 (permalink)  
Old 11-30-06, 13:25
andrewst andrewst is offline
Moderator.
 
Join Date: Sep 2002
Location: UK
Posts: 5,171
You have taken a wrong turn somewhere - probably by following some normalisation "cookbook" approach rather than understanding the data.

What I meant was that instead of this:
Code:
CUSTOMER-3
CustomerNo
CustomerTitle
CustomerForename
CustomerSurname
CustomerEmail
CustomerTelephoneNo
CustomerAddressId

CUSTOMER-ADDRESS-3
CustomerAddressId
CustomerAddressLine1
CustomerAddressLine2
CustomerTown
CustomerCounty
CustomerPostcode

DELIVERY-3
DeliveryNo
DeliveryTitle
DeliveryForename
DeliverySurname
DeliveryAddressId

DELIVERY-ADDRESS-3
DeliveryAddressId
DeliveryAddressLine1
DeliveryAddressLine2
DeliveryTown
DeliveryCounty
DeliveryPostcode
... you could/should have this:
Code:
CUSTOMER-3
CustomerNo
CustomerTitle
CustomerForename
CustomerSurname
CustomerEmail
CustomerTelephoneNo
CustomerAddressId

DELIVERY-3
DeliveryNo
DeliveryTitle
DeliveryForename
DeliverySurname
DeliveryAddressId

ADDRESS-3
AddressId
AddressLine1
AddressLine2
Town
County
Postcode
Both CustomerAddressId and DeliveryAddressId are foreign keys that reference AddressId.

This way, if the customer address and delivery address are the same, you still only store the address attributes once - which is one of the goals of normalisation.
__________________
Tony Andrews
http://tinyurl.com/tonyandrews
Reply With Quote
  #12 (permalink)  
Old 11-30-06, 13:40
Philkills Philkills is offline
Registered User
 
Join Date: Nov 2006
Posts: 33
Quote:
Originally Posted by andrewst
You have taken a wrong turn somewhere - probably by following some normalisation "cookbook" approach rather than understanding the data.

What I meant was that instead of this:
Code:
CUSTOMER-3
CustomerNo
CustomerTitle
CustomerForename
CustomerSurname
CustomerEmail
CustomerTelephoneNo
CustomerAddressId

CUSTOMER-ADDRESS-3
CustomerAddressId
CustomerAddressLine1
CustomerAddressLine2
CustomerTown
CustomerCounty
CustomerPostcode

DELIVERY-3
DeliveryNo
DeliveryTitle
DeliveryForename
DeliverySurname
DeliveryAddressId

DELIVERY-ADDRESS-3
DeliveryAddressId
DeliveryAddressLine1
DeliveryAddressLine2
DeliveryTown
DeliveryCounty
DeliveryPostcode
... you could/should have this:
Code:
CUSTOMER-3
CustomerNo
CustomerTitle
CustomerForename
CustomerSurname
CustomerEmail
CustomerTelephoneNo
CustomerAddressId

DELIVERY-3
DeliveryNo
DeliveryTitle
DeliveryForename
DeliverySurname
DeliveryAddressId

ADDRESS-3
AddressId
AddressLine1
AddressLine2
Town
County
Postcode
Both CustomerAddressId and DeliveryAddressId are foreign keys that reference AddressId.

This way, if the customer address and delivery address are the same, you still only store the address attributes once - which is one of the goals of normalisation.
That is the most logical way of achieving this, however, as far as normalisation is concerned... they are completely unrelated Attributes....

or else... where does it say : If redundant data is still present based on the observations of the Systems Analyst then it is within his discretion that an artificial link be created to prevent this.

I only ask as i can't see how you can merge 2 entities while still following the normalisation rules... "OR" create 2 foreign keys to the same entity.
Reply With Quote
  #13 (permalink)  
Old 11-30-06, 13:56
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
Quote:
Originally Posted by Philkills
...merge 2 entities
an address is an address is an address

there is only one address entity, but it has 2 different relationships
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #14 (permalink)  
Old 11-30-06, 14:09
Philkills Philkills is offline
Registered User
 
Join Date: Nov 2006
Posts: 33
Quote:
Originally Posted by r937
an address is an address is an address

there is only one address entity, but it has 2 different relationships

lol yea we as humans can see that...

But what if you were a computer... and saw DeliveryAddress and CustomerAddress....

As far as you are concerned(as a computer) they are 2 different things entirely.

Is this something that is done outside of the normalisation process though?..

i.e. once you have reached 3NF(or whatever NF you want to go to)..

Or is this something you can do during the composite logical data diagram.... where you would compare your EAR diagram with the normalisation... (as you should get your solution with the EAR as its more the opinion of the analyst)... and thus you could justify it that way?

Last edited by Philkills; 11-30-06 at 14:13.
Reply With Quote
  #15 (permalink)  
Old 11-30-06, 14:24
Pat Phelan Pat Phelan is offline
Resident Curmudgeon
 
Join Date: Feb 2004
Location: In front of the computer
Posts: 12,605
I generally have a single address table, and then use FKs to it for each address used anywhere in the system. That way there is an address for an invoice, another one for an employee, a different one for a client, but they all belong in the address table. This makes it lots easier to standardize on handling of addresses, and it means that if you find a problem (such as a check constraint or FK that is too restrictive or permissive), you fix it one time for everybody!

-PatP
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On