| |
|
If this is your first visit, be sure to check out the FAQ by clicking the link above.
You may have to register before you can post: click the register link above to proceed.
To start viewing messages, select the forum that you want to visit from the selection below.
|
 |
|

03-05-08, 19:51
|
|
Registered User
|
|
Join Date: Mar 2008
Posts: 52
|
|
Lookup tables (aka domain/reference/code tables) surrogate vs natural keys?
|
|
I know the surrogate vs. natural key debate has gone on for a long time, but I was wondering about it specifically in the context of lookup tables. I have often created lookup tables like the following:
Code:
status_id status
--- ---
1 pending
2 cancelled
3 completed
And in most queries involving a table that has a foreign key to the lookup table then you must do a join to get text value in the query result.
But now I'm starting to wonder what the point of using a surrogate key here is. So much so that I've stopped using them. So the lookup table becomes simply:
Code:
status
---
pending
cancelled
completed
So now the purpose of the lookup table primarily becomes one of ensuring domain integrity. And any changes to the records in the lookup table can be propagated via cascading updates.
Thoughts? What is the purpose of using a surrogate key for lookup tables anyway?
|
|

03-05-08, 22:20
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
|
|
this is a brilliant realization on your part
i agree (and have done for a long time), the surrogate key in this situation seems silly
back in the day, when disk space was very scarce, it might have made sense, but nowadays you can get 200gigabytes for the price of a samwidge
(speaking of samwidges, i is hungry)
do not forget to declare the FK (a very important part of using a lookup table, whether with a natural key or a surrogate)
|
|

03-06-08, 01:41
|
|
World Class Flame Warrior
|
|
Join Date: Jun 2003
Location: Ohio
Posts: 11,726
|
|
|
|
There is no such thing as a lookup table.
If the data has its own table, then it is a data element, not a data attribute.
To take your own example, you may start out with a simple "lookup table":
create table Status(Status varchar(50))
All well and good and simple. Then you populate some data:
pending
cancelled
completed
Great! Then, oops...the client wants more status settings. So you add:
deleted
overridden
on hold
needs certification
No problem. But which ones represent states that are "in progress"? Which ones represent states that are no longer valid? Are you going to hard-code the text values in your queries? Better to add some flags, in case the text changes:
create table Status(Status varchar(50), InProgress bit, Invalid bit)
Is it still just a "lookup table"? I don't think so. And if your standard tables use surrogate keys, then for consistency this one should too.
There is no such thing as a lookup table. Treat them the same as your other tables.
__________________
If it's not practically useful, then it's practically useless.
blindman
www.chess.com: "sqlblindman"
|
|

03-06-08, 03:26
|
|
Registered User
|
|
Join Date: Mar 2008
Posts: 52
|
|
Quote:
|
Originally Posted by r937
this is a brilliant realization on your part
|
Thanks (I'll take it where I can get it)
Quote:
|
Originally Posted by blindman
There is no such thing as a lookup table.
|
I've heard this sentiment before, and from a purity standpoint, I agree. And I am open to be convinced of another way of looking at things. But you can't deny that we have the terminology "lookup table" (in fact several variations thereof, as in the title of this post) which suggests that there is a conceptual entity backing the use of the term. There are articles all over the web that discuss them. And, to me, that suggests that, somehow, lookup tables are different from other tables, even if there is no fundamental difference and the only difference is how we look at them.
I think there is a common use-case where we need an attribute in a table (or set of tables) that is restricted to some set of values. And I would argue that, frequently, the lookup table's schema will remain stable over time.
Quote:
|
Originally Posted by blindman
And if your standard tables use surrogate keys, then for consistency this one should too.
|
Should the surrogate key/natural key decision be made at the database level or at the table level? I think I'm comfortable using surrogate keys for some tables and natural keys for others. Indeed, since I've started using natural keys for lookup tables that's what I've been doing, although admittedly I'm not sure what the ramifications of this new style will be.
In the case of the requirements changing, since the schema is going to be changing anyway, is there a conflict if what we once considered a lookup table is now no longer a lookup table because the requirements now deem that that table has some expanded function? I mean, maybe the client's requirements change a little more than in your example and you need a different table all together or two tables?
|
|

03-06-08, 04:08
|
|
vaguely human
|
|
Join Date: Jun 2007
Location: London
Posts: 2,519
|
|
I prefer natural keys simply because it makes the data easier to understand and the resultant SQL easier to read & write. It's a pain having to join to umpteen tables just to read what should be simple values from the database. Also having more complicated SQL means it's going to take longer to write the code and more likely to contain errors. Just my 2c.
|
|

03-06-08, 09:05
|
|
World Class Flame Warrior
|
|
Join Date: Jun 2003
Location: Ohio
Posts: 11,726
|
|
Quote:
|
Originally Posted by tmpuzer
But you can't deny that we have the terminology "lookup table" (in fact several variations thereof, as in the title of this post) which suggests that there is a conceptual entity backing the use of the term.
|
We also have the term "Compassionate Conservative", but that doesn't mean it actually exists either.
Quote:
|
Originally Posted by tmpuzer
Should the surrogate key/natural key decision be made at the database level or at the table level? I think I'm comfortable using surrogate keys for some tables and natural keys for others.
|
Consistency has value.
Quote:
|
Originally Posted by mike_bike_kite
I prefer natural keys simply because it makes the data easier to understand and the resultant SQL easier to read & write. It's a pain having to join to umpteen tables just to read what should be simple values from the database. Also having more complicated SQL means it's going to take longer to write the code and more likely to contain errors. Just my 2c.
|
Huh? Surrogate keys lead to simpler SQL in any moderately complex database.
You want to limit yourself to a Star Schema? Go ahead and use natural keys.
Are you going to have three, four, five or more data levels? Avoid the composite keys and use surrogates.
I used to use natural keys and surrogate keys interchangeably. But eventually I came to the realization that though I had many times needed to convert a table from a natural key to a surrogate key, I HAVE NEVER said to myself "Gee, I wish I had not used a surrogate key for that table."
My 3c.
__________________
If it's not practically useful, then it's practically useless.
blindman
www.chess.com: "sqlblindman"
|
|

03-06-08, 09:12
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
|
|
Quote:
|
Originally Posted by blindman
Consistency has value.
|
but " Foolish consistency [my emphasis] is the hobgoblin of small minds." (Ralph Waldo Emerson)
|
|

03-06-08, 09:18
|
|
King of Understatement
|
|
Join Date: Feb 2004
Location: One Flump in One Place
Posts: 14,905
|
|
I like to go au natural
.... apart from when I shouldn't - then I don't.
And naturals do make the SQL easier\ shorter. I don't know how they would make it more complex other than you are thinking joins on composite columns.
__________________
Testimonial:
Quote:
pootle flump
ur codings are working excelent.
|
|
|

03-06-08, 10:29
|
|
vaguely human
|
|
Join Date: Jun 2007
Location: London
Posts: 2,519
|
|
Quote:
Originally posted by blindman
Surrogate keys lead to simpler SQL in any moderately complex database.
|
I just don't see that - perhaps all the databases I've worked with have just been too simple. Using the example above we get the following options for some simple SQL
Code:
-- confusing and prone to errors
select t.id
from MyTab t
where t.status_id in ( 2,3 )
-- long winded
select t.id
from MyTab t,
Status s
where s.status_id = t.status_id
and s.status in ( 'cancelled', 'completed' )
-- natural and simple
select t.id
from MyTab t
where t.status in ( 'cancelled', 'completed' )
The last example is the natural key one - I know what I prefer. Could you explain the argument against the natural key in this case.
|
|

03-06-08, 12:26
|
|
World Class Flame Warrior
|
|
Join Date: Jun 2003
Location: Ohio
Posts: 11,726
|
|
"from MyTab t, Status s"
Do you pet your puppy with the same hands you use to write this code? Joins, please.
"where t.status in ( 'cancelled', 'completed' )"
You are hard-coding values, which is sloppy design.
And yes, your "Natural and simple" falls apart when it comes to composite keys. Though if your example is as complex as your designs ever get, then I guess natural keys will suffice.
__________________
If it's not practically useful, then it's practically useless.
blindman
www.chess.com: "sqlblindman"
|
|

03-06-08, 12:37
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
|
|
Quote:
|
Originally Posted by blindman
"where t.status in ( 'cancelled', 'completed' )"
You are hard-coding values, which is sloppy design.
|
horsesh1t
would you kindly explain how to pull out rows corresponding to two (out of several) statuses without coding into the query which two
|
|

03-06-08, 12:46
|
|
vaguely human
|
|
Join Date: Jun 2007
Location: London
Posts: 2,519
|
|
The bank's back office database (not one of mine) that I have in front of me uses surrogate keys for everything - I could imagine the blindman being very pleased with this one - but then again it also stores much of it's data as parameters (read EAV) so perhaps not. Sadly even the parameter names are surrogate keys so I have to look up a parameter's id, before looking up the current value of that parameter (another id), then finally look up the value of this new id to get the true value of the parameter! All these stages have a complete history in the database so each stage is just that little bit more complex.
It's one of those infinitely extendable databases that is so complex that no-one dares alter it in the slightest. Sadly there aren't functions to read / translate these values so even the simplest code quickly gets frightening. Give me simplicity any day.
|
|

03-06-08, 13:01
|
|
World Class Flame Warrior
|
|
Join Date: Jun 2003
Location: Ohio
Posts: 11,726
|
|
Quote:
|
Originally Posted by r937
horsesh1t
would you kindly explain how to pull out rows corresponding to two (out of several) statuses without coding into the query which two
|
Yes! The fisticuffs begin! Finally a worthy opponent...
Rudy, I would use bit values, as in my example above.
__________________
If it's not practically useful, then it's practically useless.
blindman
www.chess.com: "sqlblindman"
|
|

03-06-08, 13:08
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
|
|
which bit values, pls?
and how do you specify them without coding them into the query?
|
|

03-06-08, 13:26
|
|
Resident Curmudgeon
|
|
Join Date: Feb 2004
Location: In front of the computer
Posts: 12,605
|
|
Why, you insert the bit values into a #temp table, of course!
-PatP
|
|
| Thread Tools |
Search this Thread |
|
|
|
| Display Modes |
Linear Mode
|
Posting Rules
|
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts
HTML code is Off
|
|
|
|
|