If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > General > Database Concepts & Design > Keys!!??

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 12-30-03, 10:36
amigo amigo is offline
Registered User
 
Join Date: Dec 2003
Posts: 3
Keys!!??

Hi.

I am fairly new to relational databases. I am having a bit of difficulty fully understanding the proper use of keys (primary, foreign) in relational databases. I've read several articles on DB design, but I'm having some doubts about my understanding. What I do understand is that the primary key makes records unique and foreign keys are used to relate to primary keys in other tables.

My question is, can one or more foreign key(s) in a table be considered the primary key (without a separate id for the table)?

Let me illustrate...

I have a table called PERSON with PID, First Name, Last Name, Country (PK = PID).
I have a table called SYMPTOM with SID, Name (PK = SID).

I have a table called HISTORY with PID, SID, Date which represents the date each person developed a symptom. To make the records for each record unique, the Primary Key (PK) should be PID,SID (assumption is that each symptom can only develop once, thus one date, and each person can have more than one symptom - more than one person can have the same symptom). PID and SID are both foreign keys to different tables - can they be the primary keys or must I create another key for this table? i.e., HID, PID, SID, Date (PK = HID), where HID is a unique number for this table and it would then be the primary key and PID and SID would be the foreign keys?

Can either way be used? If I can use PID/SID as the PK and avoid creating a unique key HID, my table would be smaller. Is this a matter of pros and cons of using one over the other? If so, what are they?

Thanks very much!
Reply With Quote
  #2 (permalink)  
Old 12-30-03, 10:46
markrem markrem is offline
Registered User
 
Join Date: Sep 2003
Location: Virginia, USA
Posts: 246
Re: Keys!!??

Yes, that can happen. However, many projects will implement dummy keys in all tables and would create a surrogate key using the two foreign keys you mentioned.

Some RDBMS vendors, like Oracle, teach that every table should have a primary key that has 5 characteristics:
1. unique
2. nerver null
3. short
4. numeric
5. meaningless

The first two are obvious. The next two are for performance and memory tuning issues. The fifth characteristic confuses a lot of people, but I have found in my 10 years as a designer/dba that it's very important.

an example of a really bad primary key is the American Social Security Number.
1. it's not unique - due to the original way numbers were issued, there were up to 5 people with the same number. Even today, the numbers of deceased people get reissued.
2. there are 10's of millions of people in the USA without one.
3. it is short.
4. it is numeric.
5. it is not meanlingless. when I got married and my wife took my last name and she got a new SSN. her employer's database used SSN as a primary key, and they could not change the data (easily), so her taxes were going to be paid incorrectly. They finally spent a few hours manually updating the database.
__________________
MarkRem
Author, Oracle Database 10g: From Nuts to Soup
http://www.remidata.com/book_nuts2soup.htm
Reply With Quote
  #3 (permalink)  
Old 12-30-03, 11:08
amigo amigo is offline
Registered User
 
Join Date: Dec 2003
Posts: 3
Re: Keys!!??

Thanks for you reply, Mark.

Hmmm, I think just using the foreign key combinationm PID, SID as my Primary key in my History table satisfies all the criteria you mentioned.

They make the table records unique, they're non-NULL, they're numeric and are meaningless (just sequential numbers) - they're not as short as one numeric ID, but are only two numbers - is this considered short?

This also leads to my next question...

My example was simple. If I have something more complex and end up with a relation table containing 4 foreign keys which together make the table unique, each is a non-null number and meaningless, does it make sense to use the 4 as the Primary Key? At what point does efficiency, space and query complexity begin to degrade when combining multiple keys to form a Primary Key?

Thanks!
Reply With Quote
  #4 (permalink)  
Old 12-30-03, 13:07
certus certus is offline
Registered User
 
Join Date: Dec 2003
Location: Canada
Posts: 710
Composite Keys composed of foreign surrogate keys are acceptable, however dealing with large composite keys could call for a new surrogate key just for coding manageability. Composite keys in my experience do not cause performance hits but they can begin to take up quite a few columns in their children.

Personally, I tend to use surrogate keys for all my tables and do not include foreign keys as part of the primary keys. There are a few design cases where composite keys are a must, but they are few.
__________________
visit: relationary
Reply With Quote
  #5 (permalink)  
Old 12-30-03, 15:02
markrem markrem is offline
Registered User
 
Join Date: Sep 2003
Location: Virginia, USA
Posts: 246
Re: Keys!!??

Personally, I would create a surrogate (dummy) primary key, and I would also create a unique index on the combined foreign keys (i.e., unique on fk1 + fk2 + ....) I would not find fault in anyone who did not create the surrogate in a simple join table.

I remember looking at a Financials application about 6 or 7 years ago. There were many tables where the primary key was composite of 8 columns, and this composite was the foreign key in many other tables. It made coding queries a chore, because in every WHERE clause you had to join tables using all 8 columns.
__________________
MarkRem
Author, Oracle Database 10g: From Nuts to Soup
http://www.remidata.com/book_nuts2soup.htm
Reply With Quote
  #6 (permalink)  
Old 12-30-03, 15:12
sundialsvcs sundialsvcs is offline
Registered User
 
Join Date: Oct 2003
Posts: 706
Re: Keys!!??

Quote:
Originally posted by amigo
... Hmmm, I think just using the foreign key combinationm PID, SID as my Primary key in my History table satisfies all the criteria you mentioned. ...
No, there is one important difference: PID and SID are numbers known to humans. For this reason they should not be primary-keys in the tables.

It may well be that the combination {PID,SID} should be unique, in which case you can define a rule or an index to enforce that restriction. That's fine... but that does not make them good candidates for primary-keys.

The PID and SID values "mean something to someone besides the computer." Therefore, they are subject to change without notice. A salesman somewhere could decide that to win a $10 zillion account the PID should be changed to the customer's favorite color... and zingo! it becomes an application requirement and can't you have it ready by tomorrow?

(Show of hands? Has this happened to you? Thank you. ... Amigo, I rest my case...)

Primary-keys should mean absolutely nothing. Their only purpose should be: to uniquely identify the row in the table in the database of the computer.
__________________
ChimneySweep(R): fast, automatic
table repair at a click of the
mouse! http://www.sundialservices.com
Reply With Quote
  #7 (permalink)  
Old 12-30-03, 15:34
blindman blindman is offline
World Class Flame Warrior
 
Join Date: Jun 2003
Location: Ohio
Posts: 11,726
Sundial, such blanket statements are irresponsible.

The idea that primary keys should be meaningfull only to the database is just one philosphy of database design. It is one that I normally agree with, but for many small applications it is not necessarily required, and therefore not worth the coding overhead.

So what if the PID needs to be changed to the client's favorite color? In a large database this might cause problems for this to occur frequently, but in a small-midsize database it would be inconsequential.

The advantage of using PID as a primary key propogated through sub-tables is that you don't need to join to the parent table to look up the PID if it is already the foreign key in the subtable.

The disadvantage is that the use of "Natural" keys such as Employee IDs leads to composite keys that can muck up relational integrity and require confusing join clauses.

There are, of course, other benefits and costs to both strategies, but the point is that the developer needs to weigh both options and then decide on the best implementation.


blindman
Reply With Quote
  #8 (permalink)  
Old 12-30-03, 17:43
sundialsvcs sundialsvcs is offline
Registered User
 
Join Date: Oct 2003
Posts: 706
Quote:
Originally posted by blindman
Sundial, such blanket statements are irresponsible.

The idea that primary keys should be meaningfull only to the database is just one philosphy of database design. It is one that I normally agree with, but for many small applications it is not necessarily required, and therefore not worth the coding overhead. [...]

blindman
You stand by your opinion, I'll stand by mine. It's good to hear differing views and I don't call that "irresponsible" at all. That's what a forum's for, eh?

My views here are primarily shaped by having spent a year and a half profitably (to me, anyway) re-writing a major app for an insurance provider who had linked everything in the world using a "Provider ID" that was not only full of embedded information but incompatible with the format used by a company they acquired. Suddenly those IDs were ambiguous (causing dupes) and inconsistent (causing data to drop out of sight). That was a nice, $150,000 problem. Nice for me because I got the grands. Not nice for the client because with a better-designed database the money would not have needed to have been spent.

They acquired two more companies after that, and then were acquired, and never had to re-redesign the database. Thanks to me. But the anonymous person who made the original decision probably saw it as "a small database, an okay way to do this that works now, nobody outside of this department's gonna ever see this anyway ..." It was a sensible, defensible decision at the time, but it proved to be a short-sighted one.

The gist of my redesign was to introduce a truly-private primary key and to spin-off the provider-ID resolution to a separate table. And yes, it took eighteen months to do it.
__________________
ChimneySweep(R): fast, automatic
table repair at a click of the
mouse! http://www.sundialservices.com
Reply With Quote
  #9 (permalink)  
Old 12-30-03, 19:34
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
natural versus surrogate... the debate will never end

all blanket statements are bad, including this one

blindman, i mainly agree with your opinions, in this thread as well as many others, but one thing you said is really weird:
Quote:
the use of "Natural" keys such as Employee IDs leads to composite keys that can muck up relational integrity and require confusing join clauses.
either you didn't explain this point very well, or else i'm gonna have to disagree strongly

natural keys cannot possibly muck up relational integrity, and as far as confusing joins are concerned, i'm sorry, i just don't understand how you could think this

let me also make a comment or two on some other points in this thread

beware of just declaring a surrogate out of habit

it's a bad habit to get into, especially if you think that composite keys are something to be avoided

there is an advantage to using composite foreign keys, and that is that you can do simple joins with lookup tables, without having to go "up the chain" joining to parent tables, grandparent tables, etc., just to figure out which child you actually are

e.g.

division -> department -> section -> employee

count the number of employees in the XYZ division

if the department has its own surrogate key, and carries the division foreign key as an attribute (cf. ERwin makes the distinction between identifying and non-identifying relationships), and if section has its own surrogate key, and carries the department foreign key -- which is a surrogate, recall -- as an attribute, and if employee carries the section foreign key -- which is a also surrogate, recall -- as an attribute, then you must join employee to section to department to division in order to determine which employees belong to XYZ division

on the other hand, if division foreign key is propagated down into employee, those joins are not necessary, and even if you wanted to find the name of the XYZ division, you could join employee to division to get it, and not go through section or department

sundial, your comment that primary keys should mean nothing, and that their only purpose is to uniquely identify a row, is a Post Hoc Ergo Propter Hoc fallacy

yes, the primary key should uniquely identify a row

no, this does not imply that it may not have meaning

of course it may have meaning

yes, i know, i know, you guys are gonna jump all over me and say that if the primary key has meaning, then it can change, and if it can change, then your app will get into trouble

well, duh

that's what ON UPDATE CASCADE is for
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #10 (permalink)  
Old 12-30-03, 20:44
certus certus is offline
Registered User
 
Join Date: Dec 2003
Location: Canada
Posts: 710
Heh. I liked that close, rudy.
__________________
visit: relationary
Reply With Quote
  #11 (permalink)  
Old 12-30-03, 22:12
amigo amigo is offline
Registered User
 
Join Date: Dec 2003
Posts: 3
Re: Keys!!??

Quote:
Originally posted by sundialsvcs
No, there is one important difference: PID and SID are numbers known to humans. For this reason they should not be primary-keys in the tables.

It may well be that the combination {PID,SID} should be unique, in which case you can define a rule or an index to enforce that restriction. That's fine... but that does not make them good candidates for primary-keys.

The PID and SID values "mean something to someone besides the computer." Therefore, they are subject to change without notice.
Well, either I didn't explain fully or clearly, or I don't quite understand you...

The PID and SID are meant to be auto-increment numbers (as far as I understand it, they therefore don't mean anything and are not known to humans) - they are not anything like a social security number, for example. Assuming this, was my assumption correct that they would be good candidates for the primary key in the History table? In this scenario, are they considered surrogate keys? I haven't read this term being used in any of the articles I read.

Just remember ... I'm a bit new at this. Thanks.
Reply With Quote
  #12 (permalink)  
Old 12-30-03, 23:22
blindman blindman is offline
World Class Flame Warrior
 
Join Date: Jun 2003
Location: Ohio
Posts: 11,726
r937, here's what I meant by natural keys sometimes mucking up relation integrity:

In a complex database with multiple relational layers and multiple relationships between tables, each natural key is included in the natural key of it's child tables. Before I knew better I once tried to use natural keys in a schema that had six distinct relational layers. The lowest layer ended up with six columns as a composite natural key. That made the joins hairy. Then I realized that data in the upper table might have more than one relational path to data in the lower tables. That caused cascading updates to fail, because it was trying to update records in the lower tables twice. From then on, I've tended toward surrogate keys except in simple schemas.

blindman
Reply With Quote
  #13 (permalink)  
Old 12-31-03, 00:03
blindman blindman is offline
World Class Flame Warrior
 
Join Date: Jun 2003
Location: Ohio
Posts: 11,726
amigo:

Yes, your SIDs and PIDs are surrogate keys.

Yes, they are probably good candidates for a composite key in the History table. If there are child table build off of the History table (unlikely) then you should consider creating another surrogate key in the history table so that the lower tables don't end up using the SIDs, PIDs, KIDs, NIDs, VIDs, ZIDs, and whatever else as mongo composite keys.

That's just my rule of thumb, of course. This is the sort of judgement that can only be made with experience, so create your schema and get some!

blindman
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On