| |
|
If this is your first visit, be sure to check out the FAQ by clicking the link above.
You may have to register before you can post: click the register link above to proceed.
To start viewing messages, select the forum that you want to visit from the selection below.
|
 |

09-06-10, 13:32
|
|
Registered User
|
|
Join Date: Jan 2010
Posts: 19
|
|
|
Using surrogate keys because of key length?
|
|
I know that the subject of surrogate vs natural keys envokes strong feelings here, but assuming that you prefer natural keys, have you sometimes had to use a surrogate anyway because of key length?
I'm looking at a design right now, where the natural key would be a descriptive name of up to 160 chars. I suppose I'm not the first one to come across such a situation.
I know that some systems force the user to make up a short arbitrary code in circumstances like this, but that's just shifting the responsibility for generating the surrogate key from the computer to the user.
|
|

09-06-10, 17:22
|
|
Resident Curmudgeon
|
|
Join Date: Feb 2004
Location: In front of the computer
Posts: 12,605
|
|
I'm a strong believer in using both a surrogate Primary Key, and a Unique Constraint for a Natural Key. They two are synonyms and need to be treated as such, but the surrogate key is often thousands of times easier for the code to use for any number of reasons.
-PatP
__________________
In theory, theory and practice are identical. In practice, theory and practice are unrelated.
|
|

09-06-10, 18:27
|
|
Registered User
|
|
Join Date: Dec 2007
Location: London, UK
Posts: 732
|
|
|
|
Quote:
Originally Posted by Anna-J
I know that the subject of surrogate vs natural keys envokes strong feelings here, but assuming that you prefer natural keys, have you sometimes had to use a surrogate anyway because of key length?
I'm looking at a design right now, where the natural key would be a descriptive name of up to 160 chars. I suppose I'm not the first one to come across such a situation.
I know that some systems force the user to make up a short arbitrary code in circumstances like this, but that's just shifting the responsibility for generating the surrogate key from the computer to the user.
|
In a case where you have an unusually long string as the only potential identifier you should be asking some questions of the business users. How do they expect to identify the information represented by this table? Can they rely on differentiating two 160 character strings that differ only by one character? You may determine that it's more convenient and reliable to use or create a shorter identifier for them. That would not be a surrogate key. It would be a meaningful business key rather than a surrogate because it's used by the consumers of the data rather than being purely internal to the database or an application.
If you do want to make the 160 character column unique then a surrogate is probably a very good idea. Obviously the surrogate doesn't make the column in question unique, so then you would need two uniqueness constraints instead of one.
|
|

09-06-10, 18:37
|
|
Resident Curmudgeon
|
|
Join Date: Feb 2004
Location: In front of the computer
Posts: 12,605
|
|
In a lot of the applications that I build, the Natural Key is comprised of three or more columns, and can often have a maximum length much larger than 160 characters.
The following discussion is specifically targeted toward code generators, but it applies almost as well to applications using any standard development platform such as Windows Forms, MFC, etc. When using code generator techniques, a single column key is far easier to use than a multiple column key. It is also far simpler to use code generation tools if the Primary Key for all of the tables is the same data type.
As long as the Data Archictect and the developers clearly understand that the Surrogate Key and the Natural Key are synonyms, I've never had a problem with this technique.
-PatP
__________________
In theory, theory and practice are identical. In practice, theory and practice are unrelated.
|
|

09-06-10, 21:42
|
|
Registered User
|
|
Join Date: Dec 2007
Location: London, UK
Posts: 732
|
|
Quote:
Originally Posted by Pat Phelan
As long as the Data Archictect and the developers clearly understand that the Surrogate Key and the Natural Key are synonyms, I've never had a problem with this technique.
|
By "synonym" in this case, do you mean that the surrogate and the natural key values always identify the same thing? That's not always true though. Either could change independently of each other. It's often a working assumption that surrogate key values will not be allowed to change at runtime but it could be a requirement to allow natural key values to be changed. Changing any key values amounts to the same thing: it changes the identification of the things being represented in the table.
|
|

09-07-10, 09:06
|
|
Resident Curmudgeon
|
|
Join Date: Feb 2004
Location: In front of the computer
Posts: 12,605
|
|
I mean synonym in the sense that the surrogate key and the natural key identify the same row. For the purpose of identifying the underlying "thing" that the row represents the natural key and the surrogate key are equivalent. The natural key might change value, but the surrogate key and the new value of the natural key would still identify the same row so they are still synonomous even though the value of the natural key is different.
The surrogate key exists in my mind purely to make it easier to construct and verify that solutions (applications) are correct, without requiring any change that is apparent to the SME (Subject Matter Expert). At least in my applications, adding a surrogate key is unnoticed by the user. I go to considerable lengths to keep the user from being aware that a surrogate key exists because I see the surrogate key as part of the solution instead of part of the problem to be solved.
-PatP
__________________
In theory, theory and practice are identical. In practice, theory and practice are unrelated.
|
|

09-07-10, 12:28
|
|
Registered User
|
|
Join Date: Jan 2010
Posts: 19
|
|
Quote:
Originally Posted by Pat Phelan
When using code generator techniques, a single column key is far easier to use than a multiple column key. It is also far simpler to use code generation tools if the Primary Key for all of the tables is the same data type.
|
Could you give an example of this? I'd be grateful for one, since I have a few tables with 3 column PKs.
|
|

09-07-10, 21:35
|
|
Resident Curmudgeon
|
|
Join Date: Feb 2004
Location: In front of the computer
Posts: 12,605
|
|
Quote:
Originally Posted by Anna-J
Could you give an example of this?
|
Ummm... How?
Spin up a demonstration table with a surrogate key and three other columns. Mix and match the data types of the columns if you like real fun.
Declare the surrogate key as the Primary Key, then take your code generator and spin up a demo app to allow you to enter rows, edit rows, and delete rows using the surrogate key. Test to be sure that it works correctly. Store the generated code for a few minutes.
Remove the PK declaration, then declare the other columns as the Primary Key. Again, verify that the app works correctly. Compare the generated code with the previous version, specifically looking at the differences.
The difference should be small compared to the total application, but those differences should cause you to strongly prefer using the surrogate key especially if the columns are complex.
-PatP
__________________
In theory, theory and practice are identical. In practice, theory and practice are unrelated.
|
|
| Thread Tools |
Search this Thread |
|
|
|
| Display Modes |
Linear Mode
|
Posting Rules
|
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts
HTML code is Off
|
|
|
|
|