If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > General > Database Concepts & Design > Identifying versus non-identifying relationships

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old
Gives Bad Advice
 
Join Date: Mar 2007
Location: 010101010110100
Posts: 792
Identifying versus non-identifying relationships

Hello everyone, I have a question that seems to be more a topic of debate than anything else.

An identifying relationship is one where the PK from the parent table is also *part of* the PK in the child table. As such:

T1
T1_ID

T2
T1_ID
T2_ID

I found this article at datawarehouse.com:

http://www.datawarehouse.com/article/?articleid=3094

This article says that identifying relationships are bad. Very bad at that because future changes to the database can become costly creating a ripple effect in that when you change the PK in one table you must also make that same change in the other tables.

I have been using non-identifying relationships exclusivley (Also on associative tables) but have been told this may be incorrect.

Can anyone tell me why using an identifying relationship is better than using a non-identifying relationship?

One thing that I do not like about using non-identifying relationships is that the data integrety is not enforced.

I would be interested in hearing both sides so that I can make a determination on how to proceed with a large design model.

Thanks,

Frank
Reply With Quote
  #2 (permalink)  
Old
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 20,000
Quote:
Originally Posted by fjm1967
One thing that I do not like about using non-identifying relationships is that the data integrety is not enforced.
this isn't true, how did you get this idea?

regarding changes to the primary key, what do you see as the pros and cons of allowing changes to the primary key?

once you identify something positively, why would it ever need to change its identity?

if you can choose a primary key that won't change, why would you need a non-identifying foreign key?
__________________
rudy.ca | @rudydotca
Buy my SitePoint book: Simply SQL
Reply With Quote
  #3 (permalink)  
Old
World Class Flame Warrior
 
Join Date: Jun 2003
Location: Ohio
Posts: 12,561
Quote:
Originally Posted by fjm1967
This article says that identifying relationships are bad. Very bad at that because future changes to the database can become costly creating a ripple effect in that when you change the PK in one table you must also make that same change in the other tables.
This is not really a big issue, despite how often you hear this opinion stated. How often do you change a primary key? Pretty rarely, and if you have to you can be pretty sure there are other larger changes that need to be made simultaneously that will make changing the primary key seem like peanuts.
How often do you change the value of a primary key? Rarely, even for natural keys, and almost never in a data warehouse environment. And when you do the cascading updates and deletes are not going to overwhelm your system unless your database is huge.
The real reason I avoid using what you call "identifying relationships" is that it leads to composite primary keys, with every relationship extended the key's size. Its easy to find yourself with a composite key that may have six or seven columns. This is really messy for coding, and so I use surrogate keys almost exclusively.
__________________
If it's not practically useful, then it's practically useless.

blindman
www.chess.com: "sqlblindman"
www.LobsterShot.blogspot.com
Reply With Quote
  #4 (permalink)  
Old
Gives Bad Advice
 
Join Date: Mar 2007
Location: 010101010110100
Posts: 792
Quote:
Originally Posted by r937
once you identify something positively, why would it ever need to change its identity?
Here is one reason, a direct quote from the datawarehouse.com article:

"In Part 1, I developed a case study based on a fragment of an Item Management data model. In that data model, identifying relationships were used. After the model was populated with about a year's worth of data, management decided to change the business rules related to standard and non- standard items. I concluded Part 1 by describing the changes to the data model and the database which were required by that change in business rules and how expensive those changes were to implement."

What I think the author is trying to impress upon the reader is the flexability a db should have with regard to future change.

Quote:
Originally Posted by r937
regarding changes to the primary key, what do you see as the pros and cons of allowing changes to the primary key?
I would have to say that a single pro would be to allow for future change in in the db when business rules change. A con would have to be that data integrety depends on never changing the PK.
Reply With Quote
  #5 (permalink)  
Old
Gives Bad Advice
 
Join Date: Mar 2007
Location: 010101010110100
Posts: 792
Quote:
Originally Posted by blindman
This is not really a big issue, despite how often you hear this opinion stated. How often do you change a primary key? Pretty rarely, and if you have to you can be pretty sure there are other larger changes that need to be made simultaneously that will make changing the primary key seem like peanuts.
How often do you change the value of a primary key? Rarely, even for natural keys, and almost never in a data warehouse environment. And when you do the cascading updates and deletes are not going to overwhelm your system unless your database is huge.
The real reason I avoid using what you call "identifying relationships" is that it leads to composite primary keys, with every relationship extended the key's size. Its easy to find yourself with a composite key that may have six or seven columns. This is really messy for coding, and so I use surrogate keys almost exclusively.
I agree, a PK is rarely changed. I was looking at the possibility of a change in business rules that would force such a change within the database.

BTW: This question I am asking stems from the post that you made to Rudy on my last model where you told Rudy that you used surrogate keys almost exclusively and Rudy told you that he was trying to show me something and insisted that I use natural keys. It was over my head but here I have two database experts that disagree and I am trying to form my own opinion. I am in favor of giving everything a sequencing surrogate key to ensure future changes to the db if need be. I am still open to using the identifying relationships if it really isn't a big deal as you suggest.

I actually went back to Rudy's old post where he said:

"the profligate and unthinkingly wanton use of surrogate keys is the main reason we get so many people posting "woe is me! how do i remove all but one of the dupes in my table?"

Can you or Rudy please explain this?
Reply With Quote
  #6 (permalink)  
Old
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 20,000
Quote:
Originally Posted by fjm1967
A con would have to be that data integrety depends on never changing the PK.
you will need to do some more research, then

primary keys do change, and data integrity requires that the foreign keys also change

that's what ON UPDATE CASCADE covers

__________________
rudy.ca | @rudydotca
Buy my SitePoint book: Simply SQL
Reply With Quote
  #7 (permalink)  
Old
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 20,000
Quote:
Originally Posted by fjm1967
...but here I have two database experts that disagree
actually, we don't

i rarely use composite foreign keys either

Quote:
Originally Posted by fjm1967
"the profligate and unthinkingly wanton use of surrogate keys is the main reason we get so many people posting "woe is me! how do i remove all but one of the dupes in my table?"

Can you or Rudy please explain this?
be happy to

in the logical data modelling phase, there are no surrogate keys -- all keys are natural keys, and the concept of identity is firmly rooted in the concept of candidate keys (none of which, in the logical modelling phase, is a surrogate, but i'm starting to repeat myself)

here's a quote which just by coincidence has been posted on my site for the last couple of weeks:
When two entities are distinct there will always be some way of telling them apart. If there isn't, then logically they are the same entity, in which case there never were two entities and we were mistaken in ever thinking there was more than one.
if you want to get even more confused, read the entire article it comes from:Key Points About Surrogate Keys

the reason surrogate keys are so dangerous in the wrong hands is because a surrogate key, by itself, does not prevent you from entering the same row of data twice

blindman and i both know what we're doing, so we don't disagree at all
__________________
rudy.ca | @rudydotca
Buy my SitePoint book: Simply SQL
Reply With Quote
  #8 (permalink)  
Old
World Class Flame Warrior
 
Join Date: Jun 2003
Location: Ohio
Posts: 12,561
Quote:
After the model was populated with about a year's worth of data, management decided to change the business rules related to standard and non- standard items. I concluded Part 1 by describing the changes to the data model and the database which were required by that change in business rules and how expensive those changes were to implement.
There are really not enough details in this statement to make any comment. What as the change in the business rules? Why was it expensive to implement? Would it NOT have been expensive to implement if identifying relationships had been used? Why would a change to business rules NOW affect the organization of HISTORICAL data in a data warehouse?
Again, I am not a big fan of identifying relationships/natural keys, but this article seems to be parroting the most commonly cited grievances against them which I happen to think are usually non-issues.
Quote:
It was over my head but here I have two database experts that disagree and I am trying to form my own opinion.
Rule of thumb: when you get two conflicting pieces of advice on a forum, go with the poster who's login comes first alphabetically. The exception to this rule are all the posters whose logins start with "A", because they don't know squat.
__________________
If it's not practically useful, then it's practically useless.

blindman
www.chess.com: "sqlblindman"
www.LobsterShot.blogspot.com
Reply With Quote
  #9 (permalink)  
Old
King of Understatement
 
Join Date: Feb 2004
Location: One Flump in One Place
Posts: 14,910
Quote:
Originally Posted by blindman
Rule of thumb: when you get two conflicting pieces of advice on a forum, go with the poster who's login comes first alphabetically. The exception to this rule are all the posters whose logins start with "A", because they don't know squat.
Code:
USE dbforums
GO
 
CREATE LOGIN '_pootle flump' WITH password = 'password'
__________________
Testimonial:
Quote:
pootle flump
ur codings are working excelent.
Reply With Quote
  #10 (permalink)  
Old
Gives Bad Advice
 
Join Date: Mar 2007
Location: 010101010110100
Posts: 792
Quote:
Originally Posted by pootle flump
Code:
USE dbforums
GO
 
CREATE LOGIN '_pootle flump' WITH password = 'password'
LOL. Thanks for the advice.

Actually, I respect both of these guys' opinions. I'm glad actually that Rudy said that he and blindman do not disagree afterall.
Reply With Quote
  #11 (permalink)  
Old
World Class Flame Warrior
 
Join Date: Jun 2003
Location: Ohio
Posts: 12,561
We are both still very much in love, despite what the tabloids say.
__________________
If it's not practically useful, then it's practically useless.

blindman
www.chess.com: "sqlblindman"
www.LobsterShot.blogspot.com
Reply With Quote
  #12 (permalink)  
Old
Gives Bad Advice
 
Join Date: Mar 2007
Location: 010101010110100
Posts: 792
Quote:
Originally Posted by blindman
We are both still very much in love, despite what the tabloids say.
LOL. And who said Ohio people don't have a great sense of humor!! Right on Blindman!
Reply With Quote
  #13 (permalink)  
Old
Gives Bad Advice
 
Join Date: Mar 2007
Location: 010101010110100
Posts: 792
Quote:
Originally Posted by r937
you will need to do some more research, then

primary keys do change, and data integrity requires that the foreign keys also change

that's what ON UPDATE CASCADE covers

Ok, now I just don't understand this because I have read over and over again where it has been stated "Stability means that primary key values never change".

I have not had a chance to read on UPDATE CASCADE but I will.

Note: I have been trying to respond to this all day and have taken at least 20 calls between these three sentences.
Reply With Quote
  #14 (permalink)  
Old
Gives Bad Advice
 
Join Date: Mar 2007
Location: 010101010110100
Posts: 792
Quote:
Originally Posted by r937
actually, we don't
in the logical data modelling phase, there are no surrogate keys -- all keys are natural keys, and the concept of identity is firmly rooted in the concept of candidate keys (none of which, in the logical modelling phase, is a surrogate, but i'm starting to repeat myself)
I know I am wrong but I really never saw the advantage to starting off in the logical phase when I could just go right into the physical design stage. Is the reason because it forces one to look at the identy first before assigning a surrogate key?

Quote:
Originally Posted by r937
here's a quote which just by coincidence has been posted on my site for the last couple of weeks:
When two entities are distinct there will always be some way of telling them apart. If there isn't, then logically they are the same entity, in which case there never were two entities and we were mistaken in ever thinking there was more than one.
Don't laugh now Rudy, but this actually makes a lot of sense to me and doesn't confuse me at all. Pretty straightforward actually.


Quote:
Originally Posted by r937
the reason surrogate keys are so dangerous in the wrong hands is because a surrogate key, by itself, does not prevent you from entering the same row of data twice
This particular part of this post is of much interest to me because of what I stated in a previous post when I said that non-identifying relationships don't enforce data integrety. I really didn't know how to answer your question so I dared not go near it. But.. I think that you have hit the exact reason that I am so curious about identifying versus non-identifying relationships. I noticed that while using *non-identifying* relationships, I WAS able to enter duplicate data into the PK fields and thought this can't be right.. I dubbed that data integrety so maybe that answers your previous question above.

Is this the same type of data duplication you are talking about?
Reply With Quote
  #15 (permalink)  
Old
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 20,000
yes, i think you do understand

note, though, that you were not able to enter duplicate data into the PK fields -- because a PK never allows duplicates

what i think you meant to say was that you were able to enter duplicate data into an alternate or candidate key

and yeah, this is exactly the scenario -- people create a auto_increment as the PK, forget to declare a UNIQUE constraint on the "real" key, and end up with dupes
__________________
rudy.ca | @rudydotca
Buy my SitePoint book: Simply SQL
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On