Page 1 of 3 123 LastLast
Results 1 to 15 of 45
  1. #1
    Join Date
    Jul 2008
    Posts
    32

    Data Modeling Question

    Hello, I am new here and not sure if this is the correct forum to post this.

    This is a Data Modeling question.

    It is for a personal finance system I've been wanting to build for the past 5 years. My goal is to develop a platform-independent solution, although I'll likely use PHP/MySQL for implementation.

    (BTW, I chose this forum because I feel Oracle Developers are some of the most knowledgeable and enterprise-focused!)

    The gist of my system is (1) Expense has (m) ExpenseDetails, where "expenses" represent anything I spend money on, including Groceries, Clothing, Fuel, Utilities, Rent, Medical, Travel, etc.

    Experience shows me that regardless of the "category", most receipts/expenses have these fields...

    tblExpense (all receipts)
    ---------------------------
    ExpenseID
    ExpenseDate
    MerchantName
    ExpenseTotal
    PaymentInfo


    However, a problem arises - from a data modeling standpoint - with "Utilities", "Travel-related", "Insurance" and other types of expenses.

    For these "categories"/"sub-categories", there are lots of other important fields that should be captured under tblExpense, for example...

    tblExpense (fuel only)
    ----------------------
    FuelType
    TotalGallons
    CostPerGallon


    tblExpense (electricity only)
    ----------------------------
    DeliveredTo
    RateClass
    MeterReading_Start
    MeterReading_End
    CostPerKWH


    tblExpense (telephone only)
    ------------------------------
    BillingStartDate
    BillingEndDate
    TotalLocalCharges
    TotalLongDistanceCharges


    tblExpense (auto ins only)
    ----------------------------
    BillingStartDate
    BillingEndDate
    CoverageType
    InstallmentFee
    LawInforcementSurcharge

    and the list goes on...

    (While some people might try to cram this disparate data into generic fields, that would be the WRONG approach. A "Quantity" of 5 Apples is NOT the same thing as buying 5.27 Gallons of Gas!!)

    While hard to show in just text, conceptually this is what I believe is know a Super-Type/Sub-Type problem. And unfortunately, simplier databases like MS Access - which I'm using to prototype this!! - aren't designed to easily handle this Data Construct.

    How to handle this predicament?! ???

    SCENARIO #1: If I create on monster tblExpense and include all fields, then I solve the issue of missing any important information that needs to be captured, however, I will ultimately have a large table with lots of empty cells.

    SCENARIO #2: I could create a (super-type) tblExpense with a primary key "ExpenseID" and the common fields described earlier. Then I could create (sub-type) tables, e.g. tblFuel, tblElectric, tblTelephone, and tblAutoInsurance which also have a primary key "ExpenseID" and the additional respective fields in each. Then I suppose I could somehow manage the PK's and synch everything up?! ??? (I believe this is one reasonable path to follow, but I would definitely need "hand-holding" to properly implement this!)

    SCENARIO #3: One "un-informed" (and rather pompous) MS Access know-it-all I was being lectured by said I needed to create a tblExpense, tblExpenseAttribute, and tblExpenseAttributeType which would form a M-to-M relationship. After some thought, this might logically work, but I think it goes for "logical eloquence" OVER "implementation practicality". (Remember, I have to build forms and queries and logic to support my back-end design. And, to me, it would be very confusing to store things like "FuelType", "MeterReading_Start", "TotalLocalCharges", and "CoverageType" all in one table as his model would demand.

    SCENARIO #4: Be a wimp, and build a seperate system for each Expense-Type because what I am capturing is too disparate and therefore should have its own database/home.

    SCENARIO #5: Stop being so "anal-retentive" and just capture "ExpenseID", "ExpenseDate", "TotalAmount" and "Category" and be happy!


    ================================================== =================
    **NOTE: A similar problem exists with tblExpenseDetails!!

    The data seems to fall into Retail and Non-Retail buckets.

    With the first, you will find classic classroom "Order Details" type fields... "OrderItemDescirption", "UnitPrice", "Quantity", etc.

    With the second group - which is usually a Utility - you won't find ExpenseDetails because you don't buy Gas a gallon-at-a-time, or insurance in seperate parts. And while there might be Order SubCategories like "TotalLocalService" and "TotalLongDistance", in the end, everything relates back to tblExpense - and not tblExpenseDetail - because you are buying the product/service in totality - usually for the month - if you can follow that?!
    ================================================== =================


    In closing, what seemed like a very straight-forward system to build, is actually much more complicated when you look at the "big picture". At the same time, this isn't rocket science, and I am CERTAIN that I can build an intelligent, detailed, robust, and scalable system that meets MY NEEDS if I can just get a little help on the Data Modeling portion! ;D

    ** Hell of a first post, eh?! **

    Sincerely,


    Just Bob

  2. #2
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    nice first post, yeah

    i would go for scenario #1 -- what's wrong with empty cells?

    #2 is good, but as you anticipated, it is a lot more complex (and why do you need that complexity on your first database?)

    #3 is called EAV (entity-attribute-value) and you should avoid it, full stop

    my advice is to build a single table and load it with your data

    that experience alone is worth the effort

    feel free to come back and ask more questions if you need to

    by the way, i don't do oracle
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  3. #3
    Join Date
    Aug 2003
    Location
    Where the Surf Meets the Turf @Del Mar, CA
    Posts
    7,776
    If you want it done quickly & simply, make wide (many column) tables & ignore "wasted" columns.
    If you want it done correctly, research & utilize Third Normal Form.
    You can lead some folks to knowledge, but you can not make them think.
    The average person thinks he's above average!
    For most folks, they don't know, what they don't know.
    Good judgement comes from experience. Experience comes from bad judgement.

  4. #4
    Join Date
    Jul 2008
    Posts
    32
    Quote Originally Posted by anacedent
    If you want it done quickly & simply, make wide (many column) tables & ignore "wasted" columns.
    If you want it done correctly, research & utilize Third Normal Form.
    I understand 3rd Normal Form. Do you?

    My post has nothing to do with 3NF - it has to do with Super/SubType data.

    Putting all Expense fields in one table breaks no Normalization rules, but I felt that maybe using multiple tables as in Scenario #2 is a better design.


    Just Bob

  5. #5
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    Quote Originally Posted by Bob Just
    maybe using multiple tables as in Scenario #2 is a better design.
    that's a really big maybe

    for your app, it's likely overkill
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  6. #6
    Join Date
    Jul 2008
    Posts
    32
    >nice first post, yeah

    Thank you!


    > i would go for scenario #1 -- what's wrong with empty cells?

    Well, it would work, and doesn't technically break any Normalization Rules, but it would get sloppy over time.

    If only 5% of my Expenses are Gas Receipts, then for fields like "TotalGallons" and "FuelType", that is A LOT of empty fields.


    > #2 is good, but as you anticipated, it is a lot more complex (and why do
    > you need that complexity on your first database?)

    But how much harder? (I know all of the babies in the MS Access forum I was in were crying at the thought of doing it this way, but I believe (??) it is how a lot of "enterprise" systems are built.

    Also, since outside of MS Access you don't have to deal with those hideous "bound" forms, it shouldn't take that much more code. I suppose you would just use two (vs. one) SQL INSERT statements, right?



    >#3 is called EAV (entity-attribute-value) and you should avoid it, full stop

    Ah, a name for this approach!

    Can you tell me more about this approach and why you say to avoid it like the plague??

    (BTW, this came from a very pompus MS Access know-it-all...)

    Like I said, while it "logically" makes sense, it doesn't seem very practical in terms of implementation. (A rather over-simplified and "academic" approach if you ask me?!)


    > my advice is to build a single table and load it with your data

    Okay, one vote for that, but I'd still like to discuss Scenario #2 more.



    > that experience alone is worth the effort

    Well, this was my first POST, not my first DB! *LOL*


    > feel free to come back and ask more questions if you need to

    Thanks!


    > by the way, i don't do oracle

    Hey, as long as you know your stuff - and aren't pretentious - that's fine by me!



    Just Bob

  7. #7
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    Quote Originally Posted by Bob Just
    ... it would get sloppy over time.

    ... that is A LOT of empty fields.
    i ask again, what precisely is wrong with NULL columns (not "empty fields" please)

    Quote Originally Posted by Bob Just
    But how much harder?
    only one way to find out, eh

    Quote Originally Posted by Bob Just
    Can you tell me more about this approach and why you say to avoid it like the plague??
    sure:


    Quote Originally Posted by Bob Just
    (BTW, this came from a very pompus MS Access know-it-all...)
    oh, i desparately want to make a comment about EAV proponents, but i dasn't

    Quote Originally Posted by Bob Just
    I'd still like to discuss Scenario #2 more.
    sure -- why don't we do that right after you've created your scenario #2 tables and loaded them with a good representation (not just a few rows) of live data


    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  8. #8
    Join Date
    Jul 2008
    Posts
    32
    r937, you are my hero!!

    Thanks for the AWESOME reply!!

    Please give me some time to do my "due diligence" and read all your wonderful links before I respond.

    On a side note...

    Because I have become SO disgusted with MS Access, MS Access (pompous) users, MS Windows, and all of my Microsoft woes as of late, not only did I come here looking for "enlightenment", but I am also out the door at the moment to go look at an Apple MacBook!

    I'll be back in touch sometime later today or tonight.

    Thanks in advance,


    Just Bob

  9. #9
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    ms access, as an application platform, is hugely microsoftish

    but one thing it does very, very well is the graphical query builder, which i love, and which makes the pain of working with access somewhat bearable
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  10. #10
    Join Date
    Jul 2008
    Posts
    3
    Bob, old buddy. I just could NOT resist dropping in to say "hello".

    You didn't like the answers I gave you on "another" forum, because I refused to say that your poor table design is okay just because you WANT it to be okay.

    I'll leave you to the tender mercies of the Oracle guys and gals, who will tell you that one wide table with "lots of empty cells" is okay (though if you are going to go in that direction, you really ought to get a handle on the concept of Null in that regard, as one of them suggested). I think you called that design a "data warehouse" in your posts in the other forum, didn't you? Or maybe you meant something else?

    One thing, though, you should be aware that bad-mouthing the people who tried to help you surely doesn't make YOU look like a real pro, now does it? Of course, I AM assuming you want to be considered a real pro.....

    George
    Last edited by GroverParkGeorge; 07-16-08 at 16:09.

  11. #11
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    dear groverparkgeorge

    your first post makes you look like a stalker

    i am one of those guys who sees nothing wrong with NULLs

    i use the phrase "lots of empty cells" when it fits the conversation

    surely you're not one of these anti-NULL fanatics?

    you'll ahve a wonderful time here

    welcome to dbforums

    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  12. #12
    Join Date
    Jul 2008
    Posts
    3
    Perhaps it escaped your notice that our friend Bob chose to express his "question" in less than flattering terms about the folks who don't agree with him.

    Perhaps he never intended me to see those words, but the fact of the matter is that they did come to my attention. Call it stalking if you will (and you obviously will ), but when someone chooses to bad-mouth others because of a difference of opinion, well, I consider that to be less than professional and I don't see any moral high ground to be gained by ignoring it.

    Really, I just wanted to let Bob know I had become aware of his choice of words to describe those who won't support his view of the ways things should be. I would also like to wish him well as he moves on to more cordial climes. I'm sure he'll find kindred souls here. You seem to be one.

    By the way, when I discuss database issues with my peers, I try to avoid terms like "fanatic", "dogma" and so on. Loaded terms like that seem to reflect a religious approach that doesn't allow for intelligent discourse or honest disagreement.

    I am of the opinion that, all else being equal, Nulls in a table are not desirable. You are free to disagree, but please explain the rationale for your position, if you can, without resorting to name-calling. It makes for a more enlightening discussion than charges of "fanaticism".


    George
    Last edited by GroverParkGeorge; 07-16-08 at 20:24.

  13. #13
    Join Date
    Jul 2008
    Posts
    32
    Quote Originally Posted by GroverParkGeorge
    Bob, old buddy. I just could NOT resist dropping in to say "hello".

    You didn't like the answers I gave you on "another" forum, because I refused to say that your poor table design is okay just because you WANT it to be okay.
    My primary problem is that you read about 50% of what people type because you are so busy formulating a tome response to prove how "right" you really are. (Should I provide everyone a link?)

    Poor listening skills are a primary reason for failure in any human relationship - especially online. You should do some self-reflection on this one...



    Quote Originally Posted by GroverParkGeorge
    I'll leave you to the tender mercies of the Oracle guys and gals, who will tell you that one wide table with "lots of empty cells" is okay (though if you are going to go in that direction, you really ought to get a handle on the concept of Null in that regard, as one of them suggested).
    I understand the concept of Null, and anything that I, or others, may have said that wasn't worded properly was more out of laziness than understanding.



    Quote Originally Posted by GroverParkGeorge
    One thing, though, you should be aware that bad-mouthing the people who tried to help you surely doesn't make YOU look like a real pro, now does it? Of course, I AM assuming you want to be considered a real pro.....

    George
    Not that this will ever sink in, George, but anyone that has the TIME or MOTIVATION to somehow scan the Internet for a post made by me on another website is disturbing in the least.

    And, for the record, I never bad-mouthed you or your resident website by name, although I certainly made a comment about whinny Access people...

    How you found this post is beyond me?! (Unless you are searching the Internet for my e-mail, or you, or someone else is spoofing their identity.)

    Regardless, having a father who is a psychologist, I know that he would say that whoever YOU are, you have a VERY UNHEALTHY obsession with me and any prior conversations we may have had...



    Just Bob

  14. #14
    Join Date
    Jul 2008
    Posts
    32
    Rudy,

    Okay, finally time to respond!

    Quote Originally Posted by r937
    i ask again, what precisely is wrong with NULL columns (not "empty fields" please)
    I was just using colloquial terms...


    Quote Originally Posted by r937
    only one way to find out, eh
    That is why I am here!


    Quote Originally Posted by r937

    Very informative links!!

    Being "well-rounded", I can see the benefits of each way, although I generally think EAV is a bad idea. (When it was proposed to me by another, I immediately thought of "MUCK" table!! *LOL*


    Quote Originally Posted by r937
    sure -- why don't we do that right after you've created your scenario #2 tables and loaded them with a good representation (not just a few rows) of live data
    Well, how about I explain how I believe it would be done?!

    Here is my best guess...

    Let's say I have...

    tblExpense
    tblFuel
    tblElectric
    tblNaturalGas

    which all contain "Expense-level" data, but where tblExpense is my "super-type" and the others are "sub-type" tables.

    The user ("me") decides to enter a Gas Receipt.

    The user goes to the "Expense Entry Form" and enters...

    ExpenseID *system generated unique number
    ExpenseDate
    MerchantName

    ServiceType
    PumpNo
    FuelType
    TotalGallons
    CostPerGallon
    PaymentType
    CCName
    CCLast4
    OrderTotal


    and then clicks "Submit".


    At this point, I would use one SQL INSERT statement to write the BLUE fields to tblExpense and then a second SQL INSERT statement to write the RED fields to tblExpenseDetail, right??

    (This would be opposed to just using one INSERT into one table in my Scenario #1.)

    I am still leaning towards Scenario #2. (Scenario #1 seems to amateurish, and I personally find Scenario #3 to also be lazy and a lot of work.)

    Sincerely,



    Just Bob

  15. #15
    Join Date
    Jul 2008
    Posts
    32
    Quote Originally Posted by GroverParkGeorge
    Perhaps it escaped your notice that our friend Bob chose to express his "question" in less than flattering terms about the folks who don't agree with him.
    Perhaps because 1.) You don't read half of what others type so that makes your "solutions" often irrelevant, and 2.) You knowingly (or not) talk to people in a very condescending manner...

    That is the reason for any jaded comments I made - not because you didn't agree with me.


    Quote Originally Posted by GroverParkGeorge
    Perhaps he never intended me to see those words, but the fact of the matter is that they did come to my attention. Call it stalking if you will (and you obviously will )
    Definitely stalking, especially since I never called out your name or the forum we talked on.

    In fact, prior to this conversation, [you couldn't even prove I was talking about you or the forum you hang out in![/b]


    Quote Originally Posted by GroverParkGeorge
    but when someone chooses to bad-mouth others because of a difference of opinion, well, I consider that to be less than professional and I don't see any moral high ground to be gained by ignoring it.
    Stop talking down to people, George, and you'll create a lot less animosity towards yourself in life.


    Quote Originally Posted by GroverParkGeorge
    Really, I just wanted to let Bob know I had become aware of his choice of words to describe those who won't support his view of the ways things should be. I would also like to wish him well as he moves on to more cordial climes. I'm sure he'll find kindred souls here. You seem to be one.
    There. You are doing it again. You are now not only talking down to me, but to this entire Oracle forum.

    Maybe you should read the links that r937 posted above, and numerous other topics on the downfalls of the EAV model.

    Prior to coming here, I have known of numerous people who are many times more experienced and knowledgeable than you, George, that would be more likely to side with my choices on this debate than yours...

    The good news is, "there is more than one way to skin a cat". However, you seem to have this need to constantly be right. Why is that? Have you ever been wrong?

    One reason I am smart is because I can learn from others, whether we agree or not.

    George, are you saying that you know more about Database Modeling and Databases in general than r937?? Than everyone else here??

    It sounds like that to me...


    Quote Originally Posted by GroverParkGeorge
    By the way, when I discuss database issues with my peers, I try to avoid terms like "fanatic", "dogma" and so on. Loaded terms like that seem to reflect a religious approach that doesn't allow for intelligent discourse or honest disagreement.
    So instead you stalk people, and make unsolicited comments?


    Quote Originally Posted by GroverParkGeorge
    I am of the opinion that, all else being equal, Nulls in a table are not desirable. You are free to disagree, but please explain the rationale for your position, if you can, without resorting to name-calling. It makes for a more enlightening discussion than charges of "fanaticism".
    George
    I never said I am a fan of Nulls either. In fact, that is why I am skeptical of r937's love of my Scenario #1.

    My money is still on Scenario #2, although it is obviously not "perfect", but then again, I'm just a mere mortal...


    Just Bob

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •