| |
|
If this is your first visit, be sure to check out the FAQ by clicking the link above.
You may have to register before you can post: click the register link above to proceed.
To start viewing messages, select the forum that you want to visit from the selection below.
|
 |
|

02-24-09, 16:15
|
|
Registered User
|
|
Join Date: Feb 2009
Posts: 5
|
|
|
"id" attribute in db tables...
|
|
Hi everyone,
I have a few tables in my database. For example users, packages, etc. And each table has a “name” attribute, so the “users” table has “user_name”, “packages” table has “package_name”, etc. is it a good idea to also have an “id” field for each table, to avoid data duplication when associating different tables?
Thanks,
Tom
|
|

02-24-09, 16:30
|
|
Registered User
|
|
Join Date: Dec 2007
Location: London, UK
Posts: 732
|
|
It is a good thing for every table to have at least one Candidate Key. The relational model requires candidate keys whereas in SQL they are optional (but still extremely important).
The term "ID field" is somewhat overloaded so it's hard to answer your question directly. I expect you are talking about surrogate keys. From a conceptual modelling perspective it's most important that every table has at least one natural key. If it doesn't then you should rethink your model. Whether you also add a surrogate key to your table is of secondary importance. There may be good reasons for doing so but unless you have identified such a reason then in my opinion it's better to leave it out. Reasonable people differ on that point however.
If I haven't answered your question then please clarify what you mean by "Id" in this instance.
|
|

02-24-09, 17:19
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
|
|
|
|
Quote:
|
Originally Posted by dportas
Reasonable people differ on that point however.
|
in this case i agree with you 
|
|

02-24-09, 19:05
|
|
Super Moderator
|
|
Join Date: Jun 2004
Location: Arizona, USA
Posts: 1,797
|
|
I've even heard of unreasonable people differing! Perish the thought. 
__________________
Lou
使大吃一惊
"Lisa, in this house, we obey the laws of thermodynamics!" - Homer Simpson
"I have my standards. They may be low, but I have them!" - Bette Middler
"It's a book about a Spanish guy named Manual. You should read it." - Dilbert
|
|

02-25-09, 10:31
|
|
King of Understatement
|
|
Join Date: Feb 2004
Location: One Flump in One Place
Posts: 14,905
|
|
Quote:
|
Originally Posted by qmqmqm
is it a good idea to also have an “id” field for each table, to avoid data duplication when associating different tables?
|
To paraphrase from a book I recently started reading, "adding an id field does not remove data duplication". You just "duplicate" numbers instead of text.
The "id" column is known as a surrogate key. It is a surrogate (i.e. used in place of) for the natural key (the names). You could try googling surrogate and natural keys if you want more info.
__________________
Testimonial:
Quote:
pootle flump
ur codings are working excelent.
|
|
|

02-25-09, 15:25
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
|
|
you started at the back?

|
|

02-25-09, 15:58
|
|
World Class Flame Warrior
|
|
Join Date: Jun 2003
Location: Ohio
Posts: 11,726
|
|
Quote:
|
Originally Posted by pootle flump
To paraphrase from a book I recently started reading, "adding an id field does not remove data duplication". You just "duplicate" numbers instead of text.
|
How does adding an ID field duplicate numbers?
Just goes to show you can't believe everything you read, I guess.
__________________
If it's not practically useful, then it's practically useless.
blindman
www.chess.com: "sqlblindman"
|
|

02-25-09, 23:06
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
|
|
Quote:
|
Originally Posted by blindman
How does adding an ID field duplicate numbers?
|
okay, here's an example
suppose you have a table of content entries for a content management system, where each content entry has its own PK, and let's assume this is an integer
id title
423 What If I Get Sick and Die?
524 Uncle Karl and the Gasoline
537 Be Nice to Everybody
573 Hello Statue
598 The Size of our Galaxy
605 Windows Media Center Rocks
and now suppose each content entry could have multiple keywords
there would be a separate table, entrykeywords, that would look something like this --
entry_id keyword
524 family
524 reckless
537 family
537 my three rules
598 astronomy
605 television
605 windows
many developers notice that the keywords are being repeated here -- obviously, it's a many-to-many relationship, as the same keyword can be assigned to multiple entries, while the same entry can have multiple keywords
so they decide that they will "eliminate" the "redundancy" by storing the keywords in their own table, and using a numeric id to relate the entries to the keywords
so then the keywords table would look like this --
id keyword
6 windows
7 television
9 family
11 reckless
13 astronomy
16 my three rules
and then the entrykeywords table would look like this --
entry_id keyword_id
524 9
524 11
537 9
537 16
598 13
605 7
605 6
and and now we've reached the point where that statement applies -- many database developers think that they are "eliminating" the "redundancy" of having the actual keywords repeated in the entrykeywords table, whereas in fact the same amount of redundancy is still there, except now they are integer values that are being repeated
further, note that with the surrogate key, we are forced to join to the keywords table to determine the keyword, an unnecessary extra step as compared with using the natural key, the keyword itself, in the entrykeywords table
|
|

02-26-09, 03:44
|
|
King of Understatement
|
|
Join Date: Feb 2004
Location: One Flump in One Place
Posts: 14,905
|
|
Quote:
|
Originally Posted by blindman
How does adding an ID field duplicate numbers?
|
It "duplicates" numbers in exactly the same way that without using IDs "duplicates" names. Note the quotes around the word "duplicates". I was using the OP's definition of duplication, and making it clear with my quote marks that I don't respect that definition.
__________________
Testimonial:
Quote:
pootle flump
ur codings are working excelent.
|
|
|

02-28-09, 11:20
|
|
Registered User
|
|
Join Date: Feb 2009
Posts: 5
|
|
Quote:
|
Originally Posted by pootle flump
It "duplicates" numbers in exactly the same way that without using IDs "duplicates" names. Note the quotes around the word "duplicates". I was using the OP's definition of duplication, and making it clear with my quote marks that I don't respect that definition.
|
Thank you very much guys. I read and studied all your replies. Yes indeed the id field is the surrogate key. For the "users" table that I talked about, "user name" field can serve as the primary key. However, in the "packages" table, the package name and a comment field are needed to uniquely identify a record. So in this table I guess it would be more efficient to introduce an "id" surrogate key?
Thanks,
Tom
|
|

02-28-09, 11:32
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
|
|
more efficient? no, not really
the deciding factor might be based on two things: how many tables are related to the users table via a foreign key? do you allow users to change their usernames?
|
|

03-02-09, 15:49
|
|
Super Moderator
|
|
Join Date: Jun 2004
Location: Arizona, USA
Posts: 1,797
|
|
Quote:
|
Originally Posted by r937
and and now we've reached the point where that statement applies -- many database developers think that they are "eliminating" the "redundancy" of having the actual keywords repeated in the entrykeywords table, whereas in fact the same amount of redundancy is still there, except now they are integer values that are being repeated
|
True. However, N keywords in the intersection table will take up lots more space than N IDs. Especially when you include the index
Quote:
|
Originally Posted by r937
further, note that with the surrogate key, we are forced to join to the keywords table to determine the keyword, an unnecessary extra step as compared with using the natural key, the keyword itself, in the entrykeywords table
|
It's a trade-off. Space versus time.
__________________
Lou
使大吃一惊
"Lisa, in this house, we obey the laws of thermodynamics!" - Homer Simpson
"I have my standards. They may be low, but I have them!" - Bette Middler
"It's a book about a Spanish guy named Manual. You should read it." - Dilbert
|
|

03-02-09, 15:56
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
|
|
hey, these are not your dad's $200,000 98K disk platters -- today's storage is, what, a couple pennies per gigabyte?
space should be the ~last~ consideration in database design
simplicity ftw, eh
|
|

03-02-09, 16:05
|
|
Super Moderator
|
|
Join Date: Jun 2004
Location: Arizona, USA
Posts: 1,797
|
|
And yet, depending upon the situation, space can be a constraint. And, while sata drive prices had plummeted, by the time you add raid controllers and redundant hardware, extra storage is still not free.
Likewise, joins can fairly efficient, especially with integer type indexes... These aren't your dad's 10 mHz XT processors, after all...
I will grant you - storage bang for the buck has increased faster than processing bang for the buck recently.
__________________
Lou
使大吃一惊
"Lisa, in this house, we obey the laws of thermodynamics!" - Homer Simpson
"I have my standards. They may be low, but I have them!" - Bette Middler
"It's a book about a Spanish guy named Manual. You should read it." - Dilbert
|
Last edited by loquin; 03-02-09 at 16:10.
|

03-02-09, 18:45
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
|
|
Quote:
|
Originally Posted by loquin
These aren't your dad's 10 mHz XT processors, after all... 
.
|
in other words, they're faster... which means that "premature optimizations" like using integers instead of natural keys is less important today

|
|
| Thread Tools |
Search this Thread |
|
|
|
| Display Modes |
Linear Mode
|
Posting Rules
|
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts
HTML code is Off
|
|
|
|
|