| |
|
If this is your first visit, be sure to check out the FAQ by clicking the link above.
You may have to register before you can post: click the register link above to proceed.
To start viewing messages, select the forum that you want to visit from the selection below.
|
 |

11-28-07, 15:01
|
|
Registered User
|
|
Join Date: Nov 2007
Posts: 6
|
|
Noob question - I keep getting conflicting answers
|
|
Hi, i am pretty new to databases, and have been getting conflicting information about this question.
if i have 2 tables, people, and hobbies, where a person can have multiple hobbies. The person enters their name, and can then enter in any number of hobbies. So, Person 2 enters their name, and hobby, which is "tennis". Should i just add Tennis to the hobbies table, or should i check to see if its there already, and if it is, reference that.
Does the answer depend on whether i need to find people base on a hobby?
I was lead to believe that duplicate data was bad, but surly its going to be a lot quicker if i just add the hobby to the database, and dont have to check if its already there. And it also becomes more complicated should i need to delete the person, and their hobbies.
Thankyou
|
|

11-28-07, 15:10
|
|
Registered User
|
|
Join Date: May 2005
Location: San Antonio, Texas
Posts: 134
|
|
ask yourself this... Is the following meaningful:
amthomas tennis
amthomas tennis
amthomas tennis
amthomas tennis
amthomas tennis
you should do meaningful things. everything else is just going to be clutter that gets in the way at some point.
__________________
Vi veri veniversum vivus vici
By the power of truth, I, a living man, have conquered the universe
|
|

11-28-07, 15:19
|
|
World Class Flame Warrior
|
|
Join Date: Jun 2003
Location: Ohio
Posts: 11,726
|
|
|
|
Quote:
|
Originally Posted by amthomas
ask yourself this... Is the following meaningful:
amthomas tennis
amthomas tennis
amthomas tennis
amthomas tennis
amthomas tennis
|
...depends on what you've been smoking, and whether there is a lava lamp in the room.
__________________
If it's not practically useful, then it's practically useless.
blindman
www.chess.com: "sqlblindman"
|
|

11-28-07, 15:27
|
|
Registered User
|
|
Join Date: May 2005
Location: San Antonio, Texas
Posts: 134
|
|
:P whoa... like...whoa man.
__________________
Vi veri veniversum vivus vici
By the power of truth, I, a living man, have conquered the universe
|
|

11-28-07, 16:08
|
|
Registered User
|
|
Join Date: Nov 2007
Posts: 6
|
|
|
thanks for replies, but....
Thanks for you replies
but what about:
oleat tennis
amthomas tennis
blindman tennis
so your saying avoid duplication, even if it potentially makes things quicker. e.g deleting a person, without having to check if their hobbies are shared with anyone else.
|
|

11-28-07, 16:24
|
|
Registered User
|
|
Join Date: May 2005
Location: San Antonio, Texas
Posts: 134
|
|
these all look like reasonable entries to me:
oleat tennis
amthomas tennis
blindman tennis
as far as duplication, it only seems to make things quicker on the programming side. it really isn't beneficial.
although it is hard to see in the simple case like this since and update would make changes to all instances, you are breaking data integrity.
amthomas tennis
amthomas tennis
amthomas tennis
could be changed to
amthomas tennis
amthomas tennis
amthomas soccer
although you meant to get rid of tennis and put in soccer instead.
now your db is lying to you and basically useless.
I am not sure what you mean by 'deleting a person....'
You could delete a person without checking hobbies if you wanted to at any time. If you had foreign keys that cascade then you could have it remove all associated entries.
__________________
Vi veri veniversum vivus vici
By the power of truth, I, a living man, have conquered the universe
|
|

11-28-07, 17:16
|
|
Registered User
|
|
Join Date: Nov 2007
Posts: 6
|
|
Thanks amThomas very much, i dont think i have made my problem very clear - my fault.
Here is an image of the 2 options as i see it:
http://upload.imgspot.com/u/07/331/13/thing.jpg
(The data is entered on a form by a user, and then uploaded.)
Version 1 has duplicate text data, but its easy and quick to add new hobbies for people. It is also easy and quick to delete a person, and all their hobbies.
Version 2 has no repeating data, but every time someone uploads a new hobby, i have to check to see if its been added previously, if so, i can reference it. Also if i were to try and delete a person, i would have to check all their hobbies to see if they were referenced by any other person, and only then could i safely delete it.
Thats how it appears to me, but i think i may be getting confused. I dont know which is the best solution.
|
|

11-28-07, 22:46
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
|
|
in my opinion, this is all you need --
Code:
create table people
( id integer not null primary key
, name varchar(99) not null
);
create table personhobbies
( person_id integer not null
, hobby varchar(99) not null
, primary key ( person_id, hobby )
, index (hobby, person_id)
);
notice the personhobbies primary key, which is the mechanism which automatically prevents this --
amthomas tennis
amthomas tennis
amthomas tennis
amthomas tennis
the personhobbies table above is actually a relationship table, but the hobbies table doesn't actually exist
the only time you'd go to the 3-table scheme with an actual hobbies table is if you were interested in controlling which hobbies they were allowed to enter
if you want them to enter whatever they want, then you don't need the hobbies table
as for deletions, you would never need to check whether other people had the hobbies of the person you're deleting, in neither the 2-table or 3-table scheme
so, did this answer conflict with anything you've been told previoulsy?

|
|

11-29-07, 02:50
|
|
Registered User
|
|
Join Date: Nov 2007
Posts: 6
|
|
Thanks for the concise answer r937, it makes sense to me. It does conflict with other recommendations i have had, i was told that all duplication should be removed, hence the 3 table structure, but am beginning to think that is overkill.
|
|

11-29-07, 06:55
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
|
|
"all duplication should be removed" is faulty advice anyway
if you replace "tennis" in the relationship table with an integer foreign key to the hobbies table, the integer values will be duplicated exactly as often as the keyword was

|
|

11-29-07, 11:33
|
|
Registered User
|
|
Join Date: May 2005
Location: San Antonio, Texas
Posts: 134
|
|
hehe, I guess the word duplication can be vague
sorry.. I didn't realize you were worried about foreign key duplication instead of record duplication.
I had the table given above in mind, but when you stated that it would be a lot quicker I though you were trying to avoid checking code or exception catching code. The table will give you an error if you try to input duplicates that violate the unique constraint. which is good!
Now, I am just stating this as a preference... I usually would go ahead and make the join table since I usually don't just have a single attribute like tennis. And later if you wanted to add more attributes the structure already exists and you just add to the hobbies table. For example: you want to add a description to each hobby. These things do not need to be done for simple projects, and, as r937 pointed out, since the attribute is the only primary key/data that makes up the foreign key then you can use it just fine 
__________________
Vi veri veniversum vivus vici
By the power of truth, I, a living man, have conquered the universe
|
|

11-29-07, 17:06
|
|
Registered User
|
|
Join Date: Nov 2007
Posts: 6
|
|
Thanks gain for your replies.
r937 - i had never thought of it like that before.
amthomas - You say the three table method allows you to add more attributes, but surly the 2 table design allows for that too?
I think i can finally articulate my confusion in this statement:
What rules govern whether you should check a table for an existing, matching record before insertion and use that, or add a new record each time.
|
|

11-29-07, 18:44
|
|
Registered User
|
|
Join Date: May 2005
Location: San Antonio, Texas
Posts: 134
|
|
the 'two table design' is not normalized if you add more hobby information. I will give an example:
you might want to have: username (FK), hobby, hobby_description
amthomas tennis "yellow ball racket whacking"
oleat tennis "yellow ball racket whacking"
r937 tennis "yellow ball racket whacking"
Now you have repeating dependencies. A Hobby description is not directly dependent on the primary key, which is (username, hobby).
There is also a side issue of misspelling  What if tenNis were entered accidentally once or some other variation?
I only mention these since I believe you may be curious. I do not know your circumstances, but they are probably issues you do not need to worry about.
when you ask:
Quote:
|
What rules govern whether you should check a table for an existing, matching record before insertion and use that, or add a new record each time.
|
then I assume you mean adding something like:
athomas tennis
r937 tennis
oleat tennis
and you know that you want (athomas, tennis) to be in that table.
If you don't already know it is in there then you just don't know.  You can check but even if it isn't there then it is possible that it gets added between the time you look and try to insert. Your best choice is just to insert and if a unique constraint violation is thrown then you know that it already exists.
The only way to be sure without an exception (that I can think of right now.. others may have more info) would be to lock the whole table (usually you don't want to do this) then check for the entry and if it does not exist insert it. This would all have to be in one transaction.
Please let me know if that isn't what you are asking here.. I just got out of a meeting and I am a little fuzzy  heh.
[EDIT] oops.. had two table designs in my head at once :P marked hobby as a FK by accident.
__________________
Vi veri veniversum vivus vici
By the power of truth, I, a living man, have conquered the universe
|
|

11-29-07, 18:58
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
|
|
Quote:
|
Originally Posted by amthomas
marked hobby as a FK by accident.
|
well, it is, sort of -- a virtual FK to a PK that doen't exist because in the 2-table model the hobbies table isn't there
nice post, amthomas -- nice explanations
|
|

11-30-07, 03:15
|
|
Registered User
|
|
Join Date: Nov 2007
Posts: 6
|
|
Thanks very much guy's, i think ive got it, thanks for being so patient with me, i know it must be frustrating for you. This is the line that sorted it for me, i think:
If you don't already know it is in there then you just don't know. You can check but even if it isn't there then it is possible that it gets added between the time you look and try to insert. Your best choice is just to insert and if a unique constraint violation is thrown then you know that it already exists.
|
|
| Thread Tools |
Search this Thread |
|
|
|
| Display Modes |
Linear Mode
|
Posting Rules
|
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts
HTML code is Off
|
|
|
|
|