Welcome to the dBforums forums.

You are currently viewing our boards as a guest which gives you limited access to view most discussions, articles and access our other FREE features. By joining our free community you will have access to post topics, communicate privately with other members (PM), respond to polls, upload your own photos and access many other special features. Registration is fast, simple and absolutely free so please, join our community today!

If you have any problems with the registration process or your account login, please contact contact support.

If you prefer not to see double-underlined words and corresponding ads, place your cursor
here for ContentLink opt out.

Go Back  dBforums > General > Database Concepts & Design > Design considerations about primary and foreign keys

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 09-28-07, 22:52
rdx23 rdx23 is offline
Registered User
 
Join Date: Sep 2007
Posts: 9
Smile Design considerations about primary and foreign keys

Hi everyone, here´s my question:

Consider that you want a model a problem in which there are N buildings, numbered sequentially, each building has N floors numbered sequentially and each floor has N rooms numbered sequentially too. So there are several ways to model this:

A)

Buildings

building_num INTEGER (PRIMARY KEY)

Floors

floor_num INTEGER (PRIMARY KEY)
building_num INTEGER (PRIMARY KEY) (FOREIGN KEY to Buildings)

Rooms

room_num INTEGER (PRIMARY KEY)
building_num INTEGER (PRIMARY KEY) (FOREIGN KEY to Buildings)
floor_num INTEGER (PRIMARY KEY) (FOREIGN KEY to Floors)

As you can see there´s a lot of redundancy in this tables, but if I conserve the model this way it will support easily changes in requirements like storing other data about floors without a lot of work. Now consider the following version:

B)

Buildings

id_building INTEGER (PRIMARY KEY)
num_building INTEGER

Floors

id_floor INTEGER (PRIMARY KEY)
id_building INTEGER (FOREIGN KEY to Buildings)
num_floor INTEGER

Rooms

id_room INTEGER (PRIMARY KEY)
id_floor (FOREIGN KEY to Floors)
num_floor

With this approach we could reference from another table a single row of Rooms table with only the id_room value which is an autonumeric value but I would have to examine each table primary, foreign keys and other attributes in order to identify the desired room what could cause query complexity and performance hit in the future if those tables become very large. Also for this design I would need 2 indexes per table (the primary key and a unique constraint for num_x).

Another way would be having a table called Rooms with all the data it requires and omitting tables Buildings and Floors like this way:

C)

Rooms

room_num INTEGER (PRIMARY KEY)
floor_num INTEGER (PRIMARY KEY)
building_num INTEGER (PRIMARY KEY)

But the real problem with this is that I if I want to reference a single row of this table I would have to conserve a composite key of 3 columns in the referencing tables which duplicates a lot of data.

And maybe another way could be the following:

D)

Rooms

id_room INTEGER (PRIMARY KEY)
room_num INTEGER (PRIMARY KEY)
floor_num INTEGER (PRIMARY KEY)
building_num INTEGER (PRIMARY KEY)

But this will need again two indexes (primary key and unique constraint) instead of only 1 (primary key).

Every table in above possible solutions is normalized to the third normal form but as you can see each approach has its advantages and disadvantages but I´m not sure what to do? which one do you consider the best? is there another way to model this situation?, is there something I haven´t consider? in simple words... WHAT DO YOU THINK??????

Last edited by rdx23 : 09-28-07 at 23:07.
Reply With Quote
  #2 (permalink)  
Old 09-29-07, 00:36
sco08y sco08y is offline
Registered User
 
Join Date: Oct 2002
Location: Fort Polk, LA
Posts: 500
A is wrong. Should say:

room_num INTEGER (PRIMARY KEY)
building_num INTEGER (PRIMARY KEY)
floor_num INTEGER (PRIMARY KEY)
FOREIGN KEY building_num, floor_num ON Floors

Why do you think there's "a lot of redundancy" in A? If you implement these foreign key constraints, you can tell the DBMS to cascade updates and deletes, so there's no additional work to manage it on your part. Using surrogates as you do in other examples requires extra code to create the surrogates.

Personally, I've never been a big fan of surrogate keys. But, the fact is that building numbers and room numbers are *already* surrogate keys, so your designs B and D are using surrogates to surrogates. I don't think you can really simplify your SQL much by using an extra layer of surrogates, and the usual performance claims don't apply when everything is a bunch of ints.

I think that if you went with C you'd wind up doing A anyway, simply because you'll want to store information about buildings and floors at some point. For instance, if you want to store street addresses, you'd want a buildings table to put that in. And if you need to record floor layouts, it only makes sense to stash that in a floors table. So my thinking is design A, with the correction I made above, of course.
Reply With Quote
  #3 (permalink)  
Old 09-29-07, 01:13
rdx23 rdx23 is offline
Registered User
 
Join Date: Sep 2007
Posts: 9
Hi!, you´re totally right, correct foreign key for table Rooms in design A is like you said:

room_num INTEGER (PRIMARY KEY)
building_num INTEGER (PRIMARY KEY)
floor_num INTEGER (PRIMARY KEY)
FOREIGN KEY building_num, floor_num ON Floors

I´m sorry about the mistake.

About redundancy in design A I meen that you carry each table´s information through all the others, what doesn´t happen with C, furthermore if you have Buildings and Floors tables you need extra space to store the same information contained in Rooms table and also you would need space for indexes too, but as you said, maybe sometime we could need to store information about buildings but my biggest concern is about other tables that could reference Rooms table if it doesn´t have a surrogate key because you would have to propagate 3 columns everywhere you need to reference it.
Reply With Quote
  #4 (permalink)  
Old 09-29-07, 05:12
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 13,556
Quote:
Originally Posted by rdx23
... because you would have to propagate 3 columns everywhere you need to reference it.
and the problem with this is ... ?

actually, i think it's a benefit

if there actually is something that needs to refer to the room (e.g. a desk), it automatically refers to the floor and building as well, and you do not have to join to the floor table and/or the building table in order to find out which ones they are

last time i checked, disk space was very cheap and excess processing was to be studiously avoided to make the app faster...

__________________
r937.com | rudy.ca

pre-order my book Simply SQL from Amazon
Reply With Quote
  #5 (permalink)  
Old 09-29-07, 17:32
rdx23 rdx23 is offline
Registered User
 
Join Date: Sep 2007
Posts: 9
ok, it´s true.. it´s better to propagate the 3 columns to other tables in terms of performance but will you choose design A or C if you know that you won´t need to store information about Buildings and Floors in the short term? I think that if C is choosen and in the future this need appears it should be easy to modify the database to add it the Buildings and Floors tables.. am I right? and let me ask you something else.. what do you think about alphanumeric primary keys? is there something wrong with them? should they always be numbers?
Reply With Quote
  #6 (permalink)  
Old 09-30-07, 00:21
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 13,556
Quote:
Originally Posted by rdx23
what do you think about alphanumeric primary keys? is there something wrong with them? should they always be numbers?
i love them, no there isn't, and no they shouldn't always
__________________
r937.com | rudy.ca

pre-order my book Simply SQL from Amazon
Reply With Quote
  #7 (permalink)  
Old 09-30-07, 00:53
rdx23 rdx23 is offline
Registered User
 
Join Date: Sep 2007
Posts: 9
So it seems everybody hates surrogate keys, this is a little bit difficult because it says something different everywhere I look for information about this issue, maybe the whole problem can be summarized in: should you use composite primary keys for entitys?, and I meen keys composed of 4, 5 or even 6 columns instead of surrogate keys? there are two big considerations and I think both are completely valid: Surrogate keys are independent of bussines changes and natural keys (Frequently composite keys) improves performance minimizing the number of tables where you need to look for when you make a query.
Reply With Quote
  #8 (permalink)  
Old 09-30-07, 01:08
pkstormy pkstormy is offline
Moderator
 
Join Date: Dec 2004
Location: Madison, WI
Posts: 2,830
I use a combination of both methods.
__________________
Expert Database Programming
MSAccess since 1.0, SQL Server since 6.5, Visual Basic (5.0, 6.0)
Reply With Quote
  #9 (permalink)  
Old 09-30-07, 11:33
blindman blindman is offline
World Class Flame Warrior
 
Join Date: Jun 2003
Location: Ohio
Posts: 9,291
Gotta disagree with Rudy here.

Go with design B. Composite cascading key annoy the hell out of me and become particularly problematic when your schema has four, five, or more levels, or when you have multiple paths such as:
Code:
|-->B--| A--| |-->D |-->C--|

You can already derive building from floors, so it is redundant to store building ID in the Rooms table. Unless you are dealing with a terrabyte of data, you'll never notice a difference in processing and it will make administration of the data much easier.
__________________
If it's not practically useful, then it's practically useless.

blindman
http://sqlblindman.googlepages.com/main
Reply With Quote
  #10 (permalink)  
Old 09-30-07, 11:47
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 13,556
"derive" (i.e. with an extra join) being the operative word
__________________
r937.com | rudy.ca

pre-order my book Simply SQL from Amazon
Reply With Quote
  #11 (permalink)  
Old 09-30-07, 13:58
Thrasymachus Thrasymachus is offline
Button Pushing Slacker
 
Join Date: Nov 2004
Location: Arlington, VA
Posts: 5,696
Quote:
Originally Posted by r937
i love them, no there isn't, and no they shouldn't always

is it not marginally faster to join on numbers than on characters. numbers being the natural language of computers. do not get me wrong, I dig natural keys (unique constraints, indexing etc..), but on large tables I tend to take a surrogate.
__________________
you are the sum of your record collection

Last edited by Thrasymachus : 09-30-07 at 14:02.
Reply With Quote
  #12 (permalink)  
Old 09-30-07, 14:10
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 13,556
yeah, but when you get right down to it, characters are stored as numbers -- 0s and 1s

__________________
r937.com | rudy.ca

pre-order my book Simply SQL from Amazon
Reply With Quote
  #13 (permalink)  
Old 09-30-07, 19:39
blindman blindman is offline
World Class Flame Warrior
 
Join Date: Jun 2003
Location: Ohio
Posts: 9,291
So are blobs. Should we use those as primary keys?
__________________
If it's not practically useful, then it's practically useless.

blindman
http://sqlblindman.googlepages.com/main
Reply With Quote
  #14 (permalink)  
Old 09-30-07, 19:43
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 13,556
the issue was whether comparing character strings is inately faster than comparing integers

i wonder how comparing two blobs of length 4 stacks up to comparing two integers
__________________
r937.com | rudy.ca

pre-order my book Simply SQL from Amazon
Reply With Quote
  #15 (permalink)  
Old 09-30-07, 23:10
sco08y sco08y is offline
Registered User
 
Join Date: Oct 2002
Location: Fort Polk, LA
Posts: 500
Quote:
Originally Posted by blindman
You can already derive building from floors, so it is redundant to store building ID in the Rooms table.

Based on the way we do it where I work, I assumed that the first floor was 1 no matter what the building and that you couldn't derive the building from the floor. I still don't see how surrogate keys make administration easier than ON UPDATE CASCADE, though I admit that I always find myself surprised when it actually works. And I think I've just gotten used to writing SQL for composite keys.

Quote:
Originally Posted by thrasymachus
is it not marginally faster to join on numbers than on characters.

Most database ops are disk bound, not CPU bound. And most string libraries try to work with an entire machine word at once, anyway.

I'm surprised no one has thought of the very worst key: floating point numbers. For example, +0. != -0., and even if the column isn't NULLABLE, it can still be NaN.
Reply With Quote
Reply


Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On