If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > General > Database Concepts & Design > "id" attribute in db tables...

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 02-24-09, 16:15
qmqmqm qmqmqm is offline
Registered User
 
Join Date: Feb 2009
Posts: 5
"id" attribute in db tables...

Hi everyone,

I have a few tables in my database. For example users, packages, etc. And each table has a “name” attribute, so the “users” table has “user_name”, “packages” table has “package_name”, etc. is it a good idea to also have an “id” field for each table, to avoid data duplication when associating different tables?

Thanks,

Tom
Reply With Quote
  #2 (permalink)  
Old 02-24-09, 16:30
dportas dportas is offline
Registered User
 
Join Date: Dec 2007
Location: London, UK
Posts: 732
It is a good thing for every table to have at least one Candidate Key. The relational model requires candidate keys whereas in SQL they are optional (but still extremely important).

The term "ID field" is somewhat overloaded so it's hard to answer your question directly. I expect you are talking about surrogate keys. From a conceptual modelling perspective it's most important that every table has at least one natural key. If it doesn't then you should rethink your model. Whether you also add a surrogate key to your table is of secondary importance. There may be good reasons for doing so but unless you have identified such a reason then in my opinion it's better to leave it out. Reasonable people differ on that point however.

If I haven't answered your question then please clarify what you mean by "Id" in this instance.
Reply With Quote
  #3 (permalink)  
Old 02-24-09, 17:19
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
Quote:
Originally Posted by dportas
Reasonable people differ on that point however.
in this case i agree with you
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #4 (permalink)  
Old 02-24-09, 19:05
loquin loquin is offline
Super Moderator
 
Join Date: Jun 2004
Location: Arizona, USA
Posts: 1,797
I've even heard of unreasonable people differing! Perish the thought.
__________________
Lou
使大吃一惊
"Lisa, in this house, we obey the laws of thermodynamics!" - Homer Simpson
"I have my standards. They may be low, but I have them!" - Bette Middler
"It's a book about a Spanish guy named Manual. You should read it." - Dilbert

Reply With Quote
  #5 (permalink)  
Old 02-25-09, 10:31
pootle flump pootle flump is offline
King of Understatement
 
Join Date: Feb 2004
Location: One Flump in One Place
Posts: 14,905
Quote:
Originally Posted by qmqmqm
is it a good idea to also have an “id” field for each table, to avoid data duplication when associating different tables?
To paraphrase from a book I recently started reading, "adding an id field does not remove data duplication". You just "duplicate" numbers instead of text.

The "id" column is known as a surrogate key. It is a surrogate (i.e. used in place of) for the natural key (the names). You could try googling surrogate and natural keys if you want more info.
__________________
Testimonial:
Quote:
pootle flump
ur codings are working excelent.
Reply With Quote
  #6 (permalink)  
Old 02-25-09, 15:25
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
you started at the back?

__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #7 (permalink)  
Old 02-25-09, 15:58
blindman blindman is offline
World Class Flame Warrior
 
Join Date: Jun 2003
Location: Ohio
Posts: 11,726
Quote:
Originally Posted by pootle flump
To paraphrase from a book I recently started reading, "adding an id field does not remove data duplication". You just "duplicate" numbers instead of text.
How does adding an ID field duplicate numbers?
Just goes to show you can't believe everything you read, I guess.
__________________
If it's not practically useful, then it's practically useless.

blindman
www.chess.com: "sqlblindman"
Reply With Quote
  #8 (permalink)  
Old 02-25-09, 23:06
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
Quote:
Originally Posted by blindman
How does adding an ID field duplicate numbers?
okay, here's an example

suppose you have a table of content entries for a content management system, where each content entry has its own PK, and let's assume this is an integer

id title
423 What If I Get Sick and Die?
524 Uncle Karl and the Gasoline
537 Be Nice to Everybody
573 Hello Statue
598 The Size of our Galaxy
605 Windows Media Center Rocks

and now suppose each content entry could have multiple keywords

there would be a separate table, entrykeywords, that would look something like this --

entry_id keyword
524 family
524 reckless
537 family
537 my three rules
598 astronomy
605 television
605 windows

many developers notice that the keywords are being repeated here -- obviously, it's a many-to-many relationship, as the same keyword can be assigned to multiple entries, while the same entry can have multiple keywords

so they decide that they will "eliminate" the "redundancy" by storing the keywords in their own table, and using a numeric id to relate the entries to the keywords

so then the keywords table would look like this --

id keyword
6 windows
7 television
9 family
11 reckless
13 astronomy
16 my three rules

and then the entrykeywords table would look like this --

entry_id keyword_id
524 9
524 11
537 9
537 16
598 13
605 7
605 6

and and now we've reached the point where that statement applies -- many database developers think that they are "eliminating" the "redundancy" of having the actual keywords repeated in the entrykeywords table, whereas in fact the same amount of redundancy is still there, except now they are integer values that are being repeated

further, note that with the surrogate key, we are forced to join to the keywords table to determine the keyword, an unnecessary extra step as compared with using the natural key, the keyword itself, in the entrykeywords table
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #9 (permalink)  
Old 02-26-09, 03:44
pootle flump pootle flump is offline
King of Understatement
 
Join Date: Feb 2004
Location: One Flump in One Place
Posts: 14,905
Quote:
Originally Posted by blindman
How does adding an ID field duplicate numbers?
It "duplicates" numbers in exactly the same way that without using IDs "duplicates" names. Note the quotes around the word "duplicates". I was using the OP's definition of duplication, and making it clear with my quote marks that I don't respect that definition.
__________________
Testimonial:
Quote:
pootle flump
ur codings are working excelent.
Reply With Quote
  #10 (permalink)  
Old 02-28-09, 11:20
qmqmqm qmqmqm is offline
Registered User
 
Join Date: Feb 2009
Posts: 5
Quote:
Originally Posted by pootle flump
It "duplicates" numbers in exactly the same way that without using IDs "duplicates" names. Note the quotes around the word "duplicates". I was using the OP's definition of duplication, and making it clear with my quote marks that I don't respect that definition.
Thank you very much guys. I read and studied all your replies. Yes indeed the id field is the surrogate key. For the "users" table that I talked about, "user name" field can serve as the primary key. However, in the "packages" table, the package name and a comment field are needed to uniquely identify a record. So in this table I guess it would be more efficient to introduce an "id" surrogate key?

Thanks,

Tom
Reply With Quote
  #11 (permalink)  
Old 02-28-09, 11:32
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
more efficient? no, not really

the deciding factor might be based on two things: how many tables are related to the users table via a foreign key? do you allow users to change their usernames?
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #12 (permalink)  
Old 03-02-09, 15:49
loquin loquin is offline
Super Moderator
 
Join Date: Jun 2004
Location: Arizona, USA
Posts: 1,797
Quote:
Originally Posted by r937
and and now we've reached the point where that statement applies -- many database developers think that they are "eliminating" the "redundancy" of having the actual keywords repeated in the entrykeywords table, whereas in fact the same amount of redundancy is still there, except now they are integer values that are being repeated
True. However, N keywords in the intersection table will take up lots more space than N IDs. Especially when you include the index
Quote:
Originally Posted by r937
further, note that with the surrogate key, we are forced to join to the keywords table to determine the keyword, an unnecessary extra step as compared with using the natural key, the keyword itself, in the entrykeywords table
It's a trade-off. Space versus time.
__________________
Lou
使大吃一惊
"Lisa, in this house, we obey the laws of thermodynamics!" - Homer Simpson
"I have my standards. They may be low, but I have them!" - Bette Middler
"It's a book about a Spanish guy named Manual. You should read it." - Dilbert

Reply With Quote
  #13 (permalink)  
Old 03-02-09, 15:56
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
hey, these are not your dad's $200,000 98K disk platters -- today's storage is, what, a couple pennies per gigabyte?

space should be the ~last~ consideration in database design

simplicity ftw, eh
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #14 (permalink)  
Old 03-02-09, 16:05
loquin loquin is offline
Super Moderator
 
Join Date: Jun 2004
Location: Arizona, USA
Posts: 1,797
And yet, depending upon the situation, space can be a constraint. And, while sata drive prices had plummeted, by the time you add raid controllers and redundant hardware, extra storage is still not free.

Likewise, joins can fairly efficient, especially with integer type indexes... These aren't your dad's 10 mHz XT processors, after all...

I will grant you - storage bang for the buck has increased faster than processing bang for the buck recently.
__________________
Lou
使大吃一惊
"Lisa, in this house, we obey the laws of thermodynamics!" - Homer Simpson
"I have my standards. They may be low, but I have them!" - Bette Middler
"It's a book about a Spanish guy named Manual. You should read it." - Dilbert


Last edited by loquin; 03-02-09 at 16:10.
Reply With Quote
  #15 (permalink)  
Old 03-02-09, 18:45
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
Quote:
Originally Posted by loquin
These aren't your dad's 10 mHz XT processors, after all...
.
in other words, they're faster... which means that "premature optimizations" like using integers instead of natural keys is less important today

__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On