If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > General > Database Concepts & Design > Surrogate Key Mania

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 12-05-03, 18:21
MrSteve MrSteve is offline
Registered User
 
Join Date: Dec 2003
Location: Essex, UK
Posts: 1
Surrogate Key Mania

I am interested in peoples opinions on surrogate keys.

In the past I didn't really see the point of using them since I could nearly always identify natural keys for my tables. Sometimes the natural keys were composites of 2 or 3 fields.

For example, to pinpoint a location on earth you need 3 things.
1. Longitude.
2, Latitude.
3. Height above sea level.

These three fields form a natural key to identify a location.
So we can have a table called "Location" with those 3 fields as a composite primary key. Non-key fields may include, Location name, description etc.

Location
--------------
*Longitude
*Latitude
*Height
LocationName
Description


Now imagine we have a temperature table which stores the avergage temperature for each day for a location

Temperature
------------------
*Longitude
*Latitude
*Height
*Date
Temperature

The primary key of the Temperature table includes the composite key of the Location table and also adds the date.
This is the sort of thing I've done in applications in the past.
Now people are telling me that this sort of thing is not normalized and I should be using a LocationID surrogate on the location table etc.
Well i've studied the Normal forms and I don't see that surrogate keys have anything to do with normalization at all. Just because you have a composite key showing up in a child table doesnt mean things are not normalized. (Am I wrong?)

However I can certainly see the benefits of a surrogate key. If we have a location ID as the primary key in the Location table we can also use it in the Temperature table and it simplifies the relationship (and probably makes indexes smaller and faster). It also isolates child tables from changes in the location. For example, imagine that the longitude had to be corrected for a location. Without the surrogate key we would have to update the longitude on all the child tables. Business data is often subject to change so composite keys are not usually as immutable as you'd like.

One disturbing thing I have noticed at work though is that when people use a surrogate key they seem to think that is all they need to guarantee uniqueness.
"I've added an Autonumber as the key, so now every record is unique!".
Well, yes every Location ID is unique, but what about the real data!
If we make LocationID the surrogate key on the location table, we still have to ensure the uniqueness of the records. Even though Longitude,Latitude and Height are no longer the primary key we must ensure that we cannot have multiple records with the same Longitude
Latitude and Height. i.e. These three fields together need a unique index.
Time and time again people seem to think they no longer have to worry about uniqueness once they have created a surrogate key.
Has anyone else noticed this Surrogate key confusion?

(Or have I just got everything totally wrong here?)
Reply With Quote
  #2 (permalink)  
Old 12-08-03, 06:54
andrewst andrewst is offline
Moderator.
 
Join Date: Sep 2002
Location: UK
Posts: 5,171
Re: Surrogate Key Mania

No, you haven't got it wrong. Surrogate keys are a convenience for avoiding various issues, some of which you have mentioned:

1) Natural keys sometimes need to be updated (corrected) and that change has to propagate to dependent tables if the natural key is used as the primary key.

2) Natural keys may be cumbersome to use in foreign keys, e.g. many columns, long text columns etc.

3) Some data doesn't have a natural key that is 100% unique - people being the obvious example.

Anyone who says natural keys are not normalised is wrong, though they may be right in suggesting that a surrogate key is a better choice in a given situation. However, as you have said, wherever a surrogate is used then unique constraints must also be used to enforce any natural keys that exist, and this tends to get overlooked by those who fit a surrogate to everything without thinking.
__________________
Tony Andrews
http://tinyurl.com/tonyandrews
Reply With Quote
  #3 (permalink)  
Old 12-08-03, 12:02
r123456 r123456 is offline
Registered User
 
Join Date: Sep 2003
Location: The extremely Royal borough of Kensington, London
Posts: 778
"1) Natural keys sometimes need to be updated (corrected) and that change has to propagate to dependent tables if the natural key is used as the primary key."

Just wondering in what way this could be considered an issue? Unless the table was exposed to high volume updates.
__________________
Bessie Braddock: Winston, you are drunk!
Churchill: And Madam, you are ugly. And tomorrow, I'll be sober, and you will still be ugly.
Reply With Quote
  #4 (permalink)  
Old 12-08-03, 12:45
andrewst andrewst is offline
Moderator.
 
Join Date: Sep 2002
Location: UK
Posts: 5,171
It is a technical issue if your DBMS does not support "update cascade", which Oracle for example doesn't, or if your FK constraints are not deferred until commit (which Oracle does allow in later versions).

You need to update the parent PK and all the child FKs together because:
a) you can't update the parent first, otherwise the children's FKs will be violated temporarily
b) you can't update the children first, ditto.

Oracle guru Tom Kyte has published a fairly complex workaround for this involving triggers here:

http://asktom.oracle.com/~tkyte/upda...ade/index.html

One can also imagine other potential issues, such as:
- PK is referenced by millions of child records in various tables, and performing this update will be time/resource-intensive and undesirable
- historic dependent data has been archived and if restored would not correspond to any parent record
__________________
Tony Andrews
http://tinyurl.com/tonyandrews
Reply With Quote
  #5 (permalink)  
Old 12-08-03, 13:03
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
mrsteve, one advantage of using the natural keys as in your example of temperatures is that you can run queries against that table to filter by values on any of the parts of the key, e.g. average temps in the tropics (filter by latitude), without having to do a join to the location table, which the use of a surrogate key would force upon you

further, you can often make a decision about whether the natural key realistically has a chance of ever changing -- for example, lat/long likely won't

finally, i'd like to second the remark that "Anyone who says natural keys are not normalised is wrong"

as well, the remark "wherever a surrogate is used then unique constraints must also be used to enforce any natural keys that exist, and this tends to get overlooked by those who fit a surrogate to everything without thinking" bears repeating

i've seen far too many needless surrogate keys

for example, in a many-to-many relationship table:

foo (id PK, fooname)
bar (id PK, barname)
foobar (id PK, fooid FK, barid FK)

the composite (fooid,barid) needs a unique constraint, which is conveniently provided if the composite were made the PK instead of the useless surrogate


rudy
http://r937.com/
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On