Results 1 to 7 of 7
  1. #1
    Join Date
    Dec 2010
    Posts
    2

    Database design help

    One of my employees is suggesting a DB schema for one of our new projects and I have some reservations about it.
    I am going to take an example to describe the tables we are setting up in our DB
    EmpName----Dept----Status----Experience
    Charles-------3--------4-----------6
    Stacy--------3--------4-----------7
    Sarah--------1--------5-----------7
    Andrew------2--------4-----------7

    Attribute----Selection-------Value
    Dept---------Sales------------1
    Dept---------Legal------------2
    Dept---------Accounting------3
    Status-------Active-----------4
    Status-------Not Active-------5
    Experience---Beginner---------6
    Experience---Expert-----------7

    The first table says Charles Dept = 3 which means looking at the second table Charles’ Department = Accounting. Pretty straight forward. In this new database we will have about 10-15 tables similar to table1 and 1 table similar to table2 which will hold the attributes of all the other tables in the DB. Each of these tables are expected to be big. About 1000+ records each.

    My first problem with this structure is that the first table above is not very readable. At one glance I cannot look at table1 and tell what Department Charles works in. I would have to either refer to table2 or write a join query between the two tables.

    The second problem I have is that my application developers who are writing the VB source code are complaining that this is adding overhead on them when they have to write something back to table1. So for example if Charles’ status changes from Active to Not Active, in the VB source instead of writing a simple query that changes Charles’ status from Active to NotActive, now they have to reference table2 everytime they have to write anything to table1.

    The DBA who is suggesting this schema is saying that writing the actual words “Sales, Accounting, Beginner, Expert… etc” will cause the database to run slower since there are so many records in these 10-15 tables. And also this would break database design disciplines.

    I am a new team lead and not really sure how to resolve this issue. Any help/insight is much appreciated.
    Last edited by donjt81; 12-04-10 at 19:53.

  2. #2
    Join Date
    Nov 2003
    Posts
    2,933
    Quote Originally Posted by donjt81 View Post
    One of my employees is suggesting a DB schema for one of our new projects and I have some reservations about it.
    I am going to take an example to describe the tables we are setting up in our DB
    EmpName----Dept----Status----Experience
    Charles-------3--------4-----------6
    Stacy--------3--------4-----------7
    Sarah--------1--------5-----------7
    Andrew------2--------4-----------7

    Attribute----Selection-------Value
    Dept---------Sales------------1
    Dept---------Legal------------2
    Dept---------Accounting------3
    Status-------Active-----------4
    Status-------Not Active-------5
    Experience---Beginner---------6
    Experience---Expert-----------7
    Having everything in the "attributes" table is asking for trouble. This is an anti-pattern called "one true lookup table" and one of the biggest disadvantages is, that you can't have any constraints with this schema.

    The DB will not prevent writing the value 7 into the dept column.

    A properly normalized solution would look something like this (assuming each employee works in exactly one department, has exactly one status and exactly one value for experience)

    employee (id, name, dept_id, status_id, experience_id)
    emp_staus (id, status_name)
    department (id, dept_name)
    emp_experience (id, experience)

    With this, you can have proper foreign keys, which prevent to reference the wrong lookup table.

    Each of these tables are expected to be big. About 1000+ records each.
    1000 Rows is peanuts and does not qualify at all to be called "big" (100 million rows start to qualify as "big")

    My first problem with this structure is that the first table above is not very readable. At one glance I cannot look at table1 and tell what Department Charles works in.
    So what?
    I would have to either refer to table2 or write a join query between the two tables.
    That's how a relational database works.
    What you are proposing is a spreadsheet, not a relational database.

    The second problem I have is that my application developers who are writing the VB source code are complaining that this is adding overhead on them when they have to write something back to table1. So for example if Charles’ status changes from Active to Not Active, in the VB source instead of writing a simple query that changes Charles’ status from Active to NotActive, now they have to reference table2 everytime they have to write anything to table1.
    Then you should send your developers to a database training.

  3. #3
    Join Date
    Dec 2010
    Posts
    2
    emp_staus (id, status_name)
    department (id, dept_name)
    emp_experience (id, experience)
    But if i combine all the attributes in all of my tables, there are total about 15 - 20 attributes. (When I say attribute I mean things like emp_status, department, emp_experience etc.) So that means I would create a table for each attribute. which means 15 - 20 of these tables. Would this be the right way to design the database?

  4. #4
    Join Date
    Dec 2007
    Location
    Richmond, VA
    Posts
    1,328
    Not all of them should be split up in separate tables. You mention emp_status, that should be a simple flag of the sort "A"ctive/"I"nactive that can go in to your employee table. Date of Birth, etc... are all attributes that belong to a single person. You mention Department, that is its own entity thus another table. This other table denotes some type of ID to simply identify that dept in other tables, then a description of some sort along with the name of the department.
    As another example, I deal with medical claims. On the claims database, I only store the medical recipient's ID number, I do not store all info about that person on each medical claim. When I need to show a claim and the person's information, I join the claims tables with the recipient table(s) to show the required information.
    In other words, myself and shammat are saying to listen to your DBA. You did after all hire him to manage your database, did you not?
    Dave Nance

  5. #5
    Join Date
    Jun 2004
    Location
    Arizona, USA
    Posts
    1,848
    I would strongly suggest that you take a look at Paul Litwin's article, "Fundamentals of Relational Database Design." for a good coverage of the issues. There are good reasons why those "database design disciplines" were developed in the first place... Do a google search for "Database Normalization."

    Like shammat, I would agree that placing all the attribute data in a single table is probably not the way to go either, primarily because you cannot allow the database to enforce the data constraints. This approach is known as the Entity-Attribute-Value (EAV) model, and it requires that the front end application(s) know details about the data structure, and enforce constraints on the data.
    Last edited by loquin; 12-06-10 at 17:19.
    Lou
    使大吃一惊
    "Lisa, in this house, we obey the laws of thermodynamics!" - Homer Simpson
    "I have my standards. They may be low, but I have them!" - Bette Middler
    "It's a book about a Spanish guy named Manual. You should read it." - Dilbert


  6. #6
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Quote Originally Posted by shammat View Post
    This is an anti-pattern called "one true lookup table" and one of the biggest disadvantages is, that you can't have any constraints with this schema.

    The DB will not prevent writing the value 7 into the dept column.
    To be fair, you CAN implement relational integrity in such a schema, though it is a bit awkward.
    From my personal perspective, the biggest problem with the OLTP ("One True Lookup Table") is...
    There Is No Such Thing As A Lookup Table.
    By which I mean that if you decide later to start storing additional attributes about these categories (DepartmentType, StatusGroup, StatusIsActive, ExperienceIndustry, etc) you can't. At least not without creating those attributes for each category.
    Not to mention the performance issues and coding headaches that crop up in OLTP.
    Use separate tables for each of these reference types.
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

  7. #7
    Join Date
    Jun 2004
    Location
    Arizona, USA
    Posts
    1,848
    Also, take a look at Ten Common Database Design Mistakes. I do like Davidson's line
    Quote Originally Posted by LouisDavidson
    Normalization is not just some plot by database programmers to annoy application programmers (that is merely a satisfying side effect!)
    Last edited by loquin; 12-06-10 at 19:23.
    Lou
    使大吃一惊
    "Lisa, in this house, we obey the laws of thermodynamics!" - Homer Simpson
    "I have my standards. They may be low, but I have them!" - Bette Middler
    "It's a book about a Spanish guy named Manual. You should read it." - Dilbert


Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •