Results 1 to 4 of 4
  1. #1
    Join Date
    Aug 2004
    Location
    Cary, NC
    Posts
    264

    Question Storing Multiple Values of Different Lengths

    Struggling with a database design to create specification sheets to send to our suppliers so they can provide us with the proper information. The basics are as follows:

    Specification sheet layout is specific to a product group or category. We currently have 20 product groups but expect this to grow in the near future.

    Specification sheet values are product specific even within one product group

    Every sheet layout shares the same basic fields then adds additional fields based on the need of that product. Some fields are required, some are optional.

    I need to be able to send to the supplier a single view of the spec sheet layout for a particular product along with the field information (size and required/optional).

    I also need to be able to store the field values that I receive back from our suppliers and then pull these up in a spec sheet view.

    My Current Design:

    The existing tables I can't change are:
    Product, Category, ProductCategory
    (one product can belong to more than one category and categories have mulitple products. Each table has an unique index field)

    Then to handle the spec sheets I was looking to add the following tables (Indexes are auto numbered)

    SpecFields: the available fields that a spec sheet can use.
    (FieldIndex,FieldName,FieldDescription,FieldSize)

    SpecSheet: the fields of a specSheet for a particular group
    (SpecSheetIndex,ProductCategory,FieldIndex,Require d)

    So far this works good. It defines how a spec sheet should look and I can view them by category.

    What I need the help on:
    (advice on the above design is welcome but it seems to be working!)

    I need to store the values for the specFields specific to the part. The data is coming in as a SpecSheet which includes the field index. So my original thought was the following:

    SpecValues: the actual values for a spec sheet specific to one part
    (SpecValueIndex, ProductIndex, FieldIndex, FieldValue)

    SpecValueIndex is auto as mentioned above. The combination of ProductIndex and FieldIndex would be unique.

    This seemed to work well in theory until I realized that I had one table field that was expected to hold values of many different sizes and data types.

    So my question is:


    How can I best store this data?
    As I mentioned above we currently have 20 product categories (which means 20 or more specSheet layouts) but will have many more.

    Any particular spec sheet will have between 19 and 48 fields. (The first 19 are always the same) Most are text fields ranging from a two character code up to a 250 character comment field. Some of the values are numeric requiring up to 4 decimal places.

    There are currently 284 possible fields that a spec sheet can use.

    The only designs I can think of are the one I described here which has the problem of the various data types and a table design with 284 fields in it that would be mostly null. Neither of these designs seem good to me.

    I have a second storage need that is very similar to this but has the same problem. I need to store up to 5 different descriptions per product and each description has a particular length that it must stay within. I figure if I can solve the more complicated version I can apply the concepts to the descriptions issue.

    Any ideas or advice?

    Steve.

    Steve

  2. #2
    Join Date
    Oct 2002
    Location
    Baghdad, Iraq
    Posts
    697
    More questions you need to answer:
    1. Will those fields change?
    2. What kinds of queries are you going to run on them? In particular, how will the data be presented?
    3. What kind of constraints do you need on the fields?
    4. Are all of these fields single-value fields? That is, are there any "list of x" fields lurking in there?
    5. Do you need to do calculations on any values of the fields?
    I understand why you want to do it that way... lots of fields in a table are ugly. But the entity/attribute value approach you tried just moves (and amplifies!) the ugliness out of the schema and into your queries and application code.

    Your approach seems attractive if the answer to #1 is "often" and #2 is "I'm going to dump the fields into an HTML table." But aside from the problems you've run into, it's particularly problematic for #3 if you have any complicated constraints, or #5 if you have calculations.

    One way of taming the beast is to break the fields up and splitting them off into a few tables based on what fields tend to be associated with each other. I suspect that if you have different types of spec sheets, there might be different reports that are more suited to one set of fields than another.

    But there's nothing inherently wrong with a big table with lots of fields. Some DBMSs don't handle it well, but that's the fault of the manufacturers, and I don't understand why DBAs will stick with a product that forces them to alter a perfectly good schema. If a C++ compiler vendor shipped a product that said "no more than 255 symbols per class" they'd go out of business!

  3. #3
    Join Date
    Aug 2004
    Location
    Cary, NC
    Posts
    264

    Angry

    Thanks for your response I appreciate the input.

    The questions you raised are very good and are making me think about this a bit more. My answers:

    1. Since the value is a particular attribute of a product once entered it should not change

    2. Queries are more in depth but not insane. Some that I know I will need to run are:
    • Does a spec sheet exist for a given product
    • Percentage of sheets for a given MFG or Group (using productIndex)
    • Percentage of required fields for a given sheet
    • Which required fields are still missing data
    And of course the previously mentioned need to show a view of one sheet in either a blank format or with values for one product
    3. Constraints would be nice but they are not essential. For now they just want to take the data provided from the supplier and assume it is valid. The most useful constraint would be required or optional

    4. Every value is a single value field for one particular product. The combination of productIndex and fieldIndex would be unique for the SpecValues table.

    5. At this point no calculations are planned. Although I suspect this is due to the fact that the information has been stored in many separate text files up to this point. Once the data is properly stored this may change. If calculations are needed they would be simple such as the average weight of all products in group Q.

    One way of taming the beast is to break the fields up and splitting them off into a few tables based on what fields tend to be associated with each other. I suspect that if you have different types of spec sheets, there might be different reports that are more suited to one set of fields than another.
    Could you elaborate on this a bit? The spec sheets are in groups but many fields are common among the sheets. The first 19 fields are always the same, the remaining combination of fields differ but weight may be used in 15 different product group spec sheets for example.

    I thought about splitting up the fields into tables such as theBaseFields and theOtherFields but it seemed like the queries required to join these back together for a particular product would be very cumbersome. BaseFields would have 19 rows for one part and OtherFields between 1 and 29 rows. So unless I am understanding this incorrectly it would require a union query just to determine if all the fields are filled out.

    One design that has been bounced around a bit has a table for every spec sheet format with a column for every attribute. Then a second table to define if these field values are required or optional for this spec sheet. But how do you connect the columns of a table into another table to show their required value? I know I can put database constraints on the fields but I need to include the required status of an attribute when requesting information on a part and if a part spec sheet arrives missing data we still want to enter what we have until we get the rest.

    So I end up running in circles yet again.

    Steve

  4. #4
    Join Date
    Oct 2002
    Location
    Baghdad, Iraq
    Posts
    697
    Quote Originally Posted by sps
    My answers:
    You know your project best, but I noticed "should not" and "would be nice" and "at this point." Myself, I prefer to err on the side of paranoia.

    Quote Originally Posted by sps
    Could you elaborate on this a bit? The spec sheets are in groups but many fields are common among the sheets. The first 19 fields are always the same, the remaining combination of fields differ but weight may be used in 15 different product group spec sheets for example.
    Two points to elaborate: if you split fields off into their own tables, consider making them not nullable, meaning that either all the fields in the record are used or none. (Some might still be nullable, depending on design.)

    Two: weight is used in 15 different sheets, but is it weighing the same thing? I know... if you were to be absolutely strict you might wind up with 15 different versions of weight. If it makes the meaning of the data clearer, consider doing it that way.

    Quote Originally Posted by sps
    I thought about splitting up the fields into tables such as theBaseFields and theOtherFields but it seemed like the queries required to join these back together for a particular product would be very cumbersome. BaseFields would have 19 rows for one part and OtherFields between 1 and 29 rows. So unless I am understanding this incorrectly it would require a union query just to determine if all the fields are filled out.
    It should just be a bunch of left joins. Also, I don't like "theBaseFields" and "theOtherFields." Tables should be composed of fields that are meaningfully related.

    Quote Originally Posted by sps
    One design that has been bounced around a bit has a table for every spec sheet format with a column for every attribute. Then a second table to define if these field values are required or optional for this spec sheet.

    But how do you connect the columns of a table into another table to show their required value? I know I can put database constraints on the fields but I need to include the required status of an attribute when requesting information on a part and if a part spec sheet arrives missing data we still want to enter what we have until we get the rest.
    If a value is optional, it should be marked as nullable in the system catalog. (Due to CREATE TABLE foo (field VARCHAR(42) NULL, field2 INT NOT NULL...)) Are you familiar with querying the database schema? If you need to, you can even make your own table and join it with the DBMS system tables. Whatever DBMS you use, there's inevitably a "table" column and a "column" column in the catalog, so you just make a table of your own with rows for each field and use inner join to piggyback your information on the system catalog. (Just don't go updating the system catalog that way... it's bad juju.)

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •