Results 1 to 7 of 7
  1. #1
    Join Date
    Jan 2004
    Posts
    4

    Unanswered: To break or not to break tables

    We are taught to break tables into multiple tables to reduce redundancy. However, sometimes it's hard to know if you are really benefiting when you exercise that practice or not. Specifically, you are told to break down tables according to functional dependency.

    In looking at my table, I can't tell if I am saving any storage space by breaking it down. It has 4,200 records in it, with only 2,600 entries. 900 entries are dupes.

    When you split a table into two seperate tables, it is costing you four bytes per record in each table. In other words, the "link" fields will take four bytes whether you link or not.

    Does anyone have a method for deciding when to break their tables down further and when to leave them alone?

  2. #2
    Join Date
    Sep 2003
    Location
    MI
    Posts
    3,713

    Re: To break or not to break tables

    Originally posted by tdion
    We are taught to break tables into multiple tables to reduce redundancy. However, sometimes it's hard to know if you are really benefiting when you exercise that practice or not. Specifically, you are told to break down tables according to functional dependency.

    In looking at my table, I can't tell if I am saving any storage space by breaking it down. It has 4,200 records in it, with only 2,600 entries. 900 entries are dupes.

    When you split a table into two seperate tables, it is costing you four bytes per record in each table. In other words, the "link" fields will take four bytes whether you link or not.

    Does anyone have a method for deciding when to break their tables down further and when to leave them alone?
    Yes. It's called "normalization" ... And there are 5 normal forms.

  3. #3
    Join Date
    Jan 2004
    Posts
    4
    Poobah,

    First, I know you earned your stars and must be considered an expert. Second, I know about normalization and have taken two database theory classes.

    To rephrase my question: Is it ever beneficial NOT to normalize? Consider instances where blank entries are still costing you the four byte link entries... Maybe I'm the only one that thinks things out that critically.

    In a table of 4,200 entries, with 2,000 empty records, Access still stores 8,000 bytes of "link" data no matter what you enter. If you leave the table alone, JET doesn't store anything for your blank entries.

    Thanks all for reading my post.

  4. #4
    Join Date
    Sep 2003
    Location
    MI
    Posts
    3,713
    Originally posted by tdion
    Poobah,

    First, I know you earned your stars and must be considered an expert. Second, I know about normalization and have taken two database theory classes.

    To rephrase my question: Is it ever beneficial NOT to normalize? Consider instances where blank entries are still costing you the four byte link entries... Maybe I'm the only one that thinks things out that critically.

    In a table of 4,200 entries, with 2,000 empty records, Access still stores 8,000 bytes of "link" data no matter what you enter. If you leave the table alone, JET doesn't store anything for your blank entries.

    Thanks all for reading my post.
    I don't consider myself an expert ... It's just a status for how many questions answered (or asked) ... And to answer your question: Yes. Case in point: I have a table with a 4 char unit of measure. I have a table that stores this UOM. Now, I have an associated UOM ID of type LONG (4 bytes). Do I store the actual code or the ID? Does it matter? Yes. I store the actual eliminating an extra look up.

    As for your last point, space is cheap and a database need be only as "normal" as you need it. Besides, if you ask 5 different SQL gurus you'll get 5 different answers ... You do best as you see fit.

  5. #5
    Join Date
    Jan 2004
    Posts
    4
    Well, Pubah, that leads me to my final question....

    How do you decide the fine line between normalizing and not normalizing?

    What I have found to be true is that most of the time I know whether I should or shouldn't, but sometimes I don't.

    If I do normalize, the SQL queries become more complicated to write.

    If I don't, the database is bloating more than it should.

    Sometimes I can't tell if normalizing will reduce redundancy.

    That's my final answer.

  6. #6
    Join Date
    Sep 2003
    Location
    MI
    Posts
    3,713
    Originally posted by tdion
    Well, Pubah, that leads me to my final question....

    How do you decide the fine line between normalizing and not normalizing?

    What I have found to be true is that most of the time I know whether I should or shouldn't, but sometimes I don't.

    If I do normalize, the SQL queries become more complicated to write.

    If I don't, the database is bloating more than it should.

    Sometimes I can't tell if normalizing will reduce redundancy.

    That's my final answer.
    Experience! Basically normalizing will either decrease the size of the DB (eliminating duplicate data) and/or it will increase search speed (via relationships and keys - primary and foreign). So, if normalizing will not do either of these why bother? Please note: How complex a query is to retrieve data is irrelevant in relation to how data is properly structured ...

    As for your last point, realize that normalizing data will always reduce/eliminate redundency ... Pretend that everything is in 1 flat file and normalize ... Boy, oh boy, that database shrinks rapidly. Think about this: For each line item of an order, if the customer info were duplicated on each line would that be a good thing? (rhetorical)

  7. #7
    Join Date
    Oct 2003
    Posts
    706
    [i]
    How do you decide the fine line between normalizing and not normalizing?
    If I do normalize, the SQL queries become more complicated to write... If I don't, the database is bloating more than it should... Sometimes I can't tell if normalizing will reduce redundancy.
    There are certainly reasons not to normalize your data. You touched on two of them. "Saving space," however, is generally not one of them. (The traditional tradeoff is "speed vs. space," and these days, "disks are big and fast and cheap." This was not always the case.)

    Another possible reason is that, "exceptions happen." One invoice might need to be mailed to a different address than usual, just this one time. A line-item description on that invoice might need to be edited, just for this invoice, without changing the status of the item itself. And so on and so forth... Flexibility is important to make sure that the system you design matches the real world. There are really no absolutes; not at all.

    Dogma is for churches and textbooks. Leave them there. Analyze your intended application, and any existing applications, to make sure that you can justify the decisions you have made, if only to yourself.
    ChimneySweep(R): fast, automatic
    table repair at a click of the
    mouse! http://www.sundialservices.com

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •