Results 1 to 5 of 5
  1. #1
    Join Date
    Dec 2003
    Location
    Essex, UK
    Posts
    1

    Surrogate Key Mania

    I am interested in peoples opinions on surrogate keys.

    In the past I didn't really see the point of using them since I could nearly always identify natural keys for my tables. Sometimes the natural keys were composites of 2 or 3 fields.

    For example, to pinpoint a location on earth you need 3 things.
    1. Longitude.
    2, Latitude.
    3. Height above sea level.

    These three fields form a natural key to identify a location.
    So we can have a table called "Location" with those 3 fields as a composite primary key. Non-key fields may include, Location name, description etc.

    Location
    --------------
    *Longitude
    *Latitude
    *Height
    LocationName
    Description


    Now imagine we have a temperature table which stores the avergage temperature for each day for a location

    Temperature
    ------------------
    *Longitude
    *Latitude
    *Height
    *Date
    Temperature

    The primary key of the Temperature table includes the composite key of the Location table and also adds the date.
    This is the sort of thing I've done in applications in the past.
    Now people are telling me that this sort of thing is not normalized and I should be using a LocationID surrogate on the location table etc.
    Well i've studied the Normal forms and I don't see that surrogate keys have anything to do with normalization at all. Just because you have a composite key showing up in a child table doesnt mean things are not normalized. (Am I wrong?)

    However I can certainly see the benefits of a surrogate key. If we have a location ID as the primary key in the Location table we can also use it in the Temperature table and it simplifies the relationship (and probably makes indexes smaller and faster). It also isolates child tables from changes in the location. For example, imagine that the longitude had to be corrected for a location. Without the surrogate key we would have to update the longitude on all the child tables. Business data is often subject to change so composite keys are not usually as immutable as you'd like.

    One disturbing thing I have noticed at work though is that when people use a surrogate key they seem to think that is all they need to guarantee uniqueness.
    "I've added an Autonumber as the key, so now every record is unique!".
    Well, yes every Location ID is unique, but what about the real data!
    If we make LocationID the surrogate key on the location table, we still have to ensure the uniqueness of the records. Even though Longitude,Latitude and Height are no longer the primary key we must ensure that we cannot have multiple records with the same Longitude
    Latitude and Height. i.e. These three fields together need a unique index.
    Time and time again people seem to think they no longer have to worry about uniqueness once they have created a surrogate key.
    Has anyone else noticed this Surrogate key confusion?

    (Or have I just got everything totally wrong here?)

  2. #2
    Join Date
    Sep 2002
    Location
    UK
    Posts
    5,171

    Re: Surrogate Key Mania

    No, you haven't got it wrong. Surrogate keys are a convenience for avoiding various issues, some of which you have mentioned:

    1) Natural keys sometimes need to be updated (corrected) and that change has to propagate to dependent tables if the natural key is used as the primary key.

    2) Natural keys may be cumbersome to use in foreign keys, e.g. many columns, long text columns etc.

    3) Some data doesn't have a natural key that is 100% unique - people being the obvious example.

    Anyone who says natural keys are not normalised is wrong, though they may be right in suggesting that a surrogate key is a better choice in a given situation. However, as you have said, wherever a surrogate is used then unique constraints must also be used to enforce any natural keys that exist, and this tends to get overlooked by those who fit a surrogate to everything without thinking.

  3. #3
    Join Date
    Sep 2003
    Location
    The extremely Royal borough of Kensington, London
    Posts
    778
    "1) Natural keys sometimes need to be updated (corrected) and that change has to propagate to dependent tables if the natural key is used as the primary key."

    Just wondering in what way this could be considered an issue? Unless the table was exposed to high volume updates.
    Bessie Braddock: Winston, you are drunk!
    Churchill: And Madam, you are ugly. And tomorrow, I'll be sober, and you will still be ugly.

  4. #4
    Join Date
    Sep 2002
    Location
    UK
    Posts
    5,171
    It is a technical issue if your DBMS does not support "update cascade", which Oracle for example doesn't, or if your FK constraints are not deferred until commit (which Oracle does allow in later versions).

    You need to update the parent PK and all the child FKs together because:
    a) you can't update the parent first, otherwise the children's FKs will be violated temporarily
    b) you can't update the children first, ditto.

    Oracle guru Tom Kyte has published a fairly complex workaround for this involving triggers here:

    http://asktom.oracle.com/~tkyte/upda...ade/index.html

    One can also imagine other potential issues, such as:
    - PK is referenced by millions of child records in various tables, and performing this update will be time/resource-intensive and undesirable
    - historic dependent data has been archived and if restored would not correspond to any parent record

  5. #5
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    mrsteve, one advantage of using the natural keys as in your example of temperatures is that you can run queries against that table to filter by values on any of the parts of the key, e.g. average temps in the tropics (filter by latitude), without having to do a join to the location table, which the use of a surrogate key would force upon you

    further, you can often make a decision about whether the natural key realistically has a chance of ever changing -- for example, lat/long likely won't

    finally, i'd like to second the remark that "Anyone who says natural keys are not normalised is wrong"

    as well, the remark "wherever a surrogate is used then unique constraints must also be used to enforce any natural keys that exist, and this tends to get overlooked by those who fit a surrogate to everything without thinking" bears repeating

    i've seen far too many needless surrogate keys

    for example, in a many-to-many relationship table:

    foo (id PK, fooname)
    bar (id PK, barname)
    foobar (id PK, fooid FK, barid FK)

    the composite (fooid,barid) needs a unique constraint, which is conveniently provided if the composite were made the PK instead of the useless surrogate


    rudy
    http://r937.com/

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •