This post seems (ok, ok, ...it is) huge but it is very clear (I hope at least) so, don't fear to read it :-)
I'm quite new to software developing and in these last months i drilled down into database design theory in order "to do things properly". I'm now designing an application for managing personal investments and i would like to discuss with you some aspect of the database design.
The "business rules" i intend to use are:
1. each investor can have 0, 1 or several portfolios (with portfolio i mean a group of investments) and each portfolio MUST belong to one and only one investor
2. each portfolio can have 0, 1 or several investments and each investment MUST belong to one and only one portfolio
3. an investment (for example an investment fund) can have 0, 1 or several events (events are every kind of money movement regarding the investment) and each event MUST belong to one and only one investment
4. an investment, MUST belong to one and only one investment type (no matter now what they are)
5. an event, MUST belong to one and only one event type (no matter now what they are)
I think the model is a working one but there are some subtle issues i'd like discuss here:
1. i made the relationships investor-portfolio and portfolio-investment a non-identifying ones because the entity involved are independent (i mean, they have an autonomous meaning) and i made them non-optional. i made the investment-event relationship an identifying one because the event entity has no meaning without an investment to refer to. Let's focus, for simplicity, on the investor-portfolio (call it I-P) and investment-event (call it I-E) relationships (the portfolio-investment is the same as investor-portfolio). This is a very clear conceptual difference, but, from a practical point of view, what are the advantage of making the I-E relationship an identifying one? in other words, what if i make it a non-identifying and non-optional one? Similarly, what if i make the I-P relationship an identifying one? Both relationships have the same cardinality, (namely 1 to 0,1,....)
so....are there practical differences? Am I loosing time with these subtleties or are they important points? let me know, please. Any kind of observation will be appreciated
2. another point is primary keys. I have used everywhere surrogate keys but i know that it would be better using natural keys and that if there are no natural keys i had better reviewing the model... so...let's see it entity by entity:
- investor: we all know that person names are not good PK so....i could use something like driving licence number but it is an application for home use, not a professional one, so i want keep it as frendly as possible an not too formal
- portfolio: the column Name is not unique. i don't want compell user to choose different names for each portfolio. I can have two investor with a portfolio with the same name so...(observation: if i had used an identifying relationship for investor-portfolio, the PK of portfolio could be composed by the columns investorId and portfolioName, and i could eliminate the surrogate key portfolioId but...is this a good reason for an identifying relationship? this sounds like question number 1 )
- investment: quite same as portfolio
- event: definitely, event instances have no natural key! ...at least i think, so....
- (not interested for now to discuss investment type and event type)
Do you have suggestions for natural keys?
So..., i'm sorry for this huge post but i wanted to explain properly everything (at least i tryed ) and i say thank you to all people who take the time to read it! Any opinion and criticism about all the stuff upon are welcome.
I agree with you.
In fact, first I thought to make the investor-portfolio and portfolio-investment relationships a identifying ones for exactly the same reasons you stated above, but then I remembered the "theory rule" stating that identifiying relationships are only for entities that have no autonomous meaning. Here, Investor, Portfolio and Investment are perfectly independent entitities on their own (from a conceptual point of view) and is only for the fact that i made the relation non-optional (...each portfolio MUST belong to one and only one investor...) that they become dependent and the relationship could be considered as a identifying one. I know...it is a tricky point; it is almost a "philosophical" matter, i.e. : have i to consider the indipendence of the entities from an abstract and theoric point of view (..investor is a person and a portfolio is like a plastic box on a desk, so they are independent) or from the point of view of my business rules (according to them, the entities involved ARE dependent)?
The core point of my post is just that! infact, the question was about the PRACTICAL implications of using one point of view or the other. Is better using the "philosophical" point of view and use non-optional non-identifying relationship or using the practical point of view and use the identifying relationship with the payload of having multicolumn PKs? i ask that because, in my opinion, technically, non-optional non-identifying relationships and identifying relationship differs only for the composition of PKs (the latter are multicolumns); am i right?
PS. I hope not to appear too much verbose...but i think it's better to be verbose and clear than to be short and give way to misunderstandings :-))
can an investor's portfolio ever be "transferred" to another investor? if yes, would it still be the same portfolio? even if it had the same investments in it? what if an investor went away, would the portfolio still exist in your database? or would you truly implement ON DELETE CASCADE?
the question of identity is at the very heart of data modelling
i'm not going to tell you about investors and portfolios and so on -- that's your job, you're the SME (subject matter expert)
i just know that for me, they'd probably start out as identifying relationships until i uncovered a reason why they should have their own PKs
-Portfolios aren't to be transferred from an investor to another.
-Investments never change the portfolio they belong to.
-If an investor is deleted, all his stuff is to be deleted. The same for portfolio deleting.
I don't want the application manage complicated scenarios...it is only for personal use, not professional.
So...ok, i agree. i changed the relationship into identifying ones.
But now, i have big structured PKs. This is not nice to manage in
coding, but i will accept this payload.
But the core question still is there: what i will gain making this choice? or, in other words, what problems i will face if i leave the relationships as non-identifying ones? This is what i would like to have clear in my mind.
PS: Thank you very much Rudy for your previous replies! (I didn't it
what problems i will face if i leave the relationships as non-identifying ones? This is what i would like to have clear in my mind.
you will face the same types of problem -- ensuring that the application functions in a way that the realted entities behave according to the rules, such as investments never change the portfolio they belong to, etc.