If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > General > Database Concepts & Design > Newbie Database Design Question, Can the database structure constantly change?

Reply
 
LinkBack Thread Tools Display Modes
  #1 (permalink)  
Old 03-05-10, 11:33
temitayo_g temitayo_g is offline
Registered User
 
Join Date: Mar 2010
Posts: 3
Question Newbie Database Design Question, Can the database structure constantly change?

Hi everyone!
I have what might seem like a simple/stupid question, but I'm new so please bear with me.
I'm trying to build a products database with the following relationships for a web application. Each product will have certain features that can change fairly rapidly and I'm wondering how to store the features information. Basically I have two ideas as shown below:

IDEA ONE

Tables:
  • Products(Product_ID[PK], Product_Name, Product_Description)
  • Features(Feature_ID[PK], Feature_Unit)
  • Product_Features(Product_Feature_ID, Product_ID[FK], Feature_ID[FK], Product_Features_Max, Product_Features_Min

In this setup we have one Product_Features table with all data in it which means if I want to find all products with two features I have to do a join as follows (note this may be erroneous thinking on my part):
SELECT Product_ID FROM Product_Features
WHERE
Product_ID IN (SELECT Product_ID FROM Product_Features WHERE Feature_ID = 1 AND Product_Features_Min > 1 )
AND
Product_ID IN (SELECT Product_ID FROM Product_Features WHERE Feature_ID = 2)

Disadvantages:
This can get really redundant since some of the products will have up 15 features and we might want to search on all of them. It just seems like bad planning to have to join a table to itself at the design phase.
It also gets tricky if a feature has anything other than numerical values. For example if a product is available in several colors.


IDEA TWO


Tables:
  • Products(Product_ID[PK], Product_Name, Product_Description, Product_Archived)
  • Features(Feature_ID[PK], Feature_Unit, Feature_Description)
  • Product_Features(Product_Feature_ID, Product_ID[FK], Feature_1, Feature_2, Feature_3, etc (Not sure if this is necessary, but hold information as to whether a certain product has a feature with only yes or no values.
  • ..... (Lots of different individual feature columns)

In this set up we break each feature into its own table for example, the Product_Color table would look like this (comma delimited, bold=column names, ends with ):
Product_ID[FK], Product_Red, Product_Black, Product_White
1, Y, Y, N
2, N, Y, Y
... etc

Disadvantages: If the features are changing fairly regularly the database structure would also be constantly changing (which I'm told is bad).

QUESTIONS: Performance will also be very important, and I really don't know which would do better performance wise.

NOTES: I honestly prefer option 2, it seems cleaner and more efficient to me, but I really want it to be right. I learned most of what I know about databases and design while I worked for a company that had a horrible data model (eg there were tables that had columns called date_added, date_modified and date_deleted that would all update when a record was modified, empty code tables, and absolutely no documentation -- most of the time to find data you had to use the testing application to reverse engineer data ). Since working there I am now terrified that I will fall into the same mistakes that they did in order to get things working.

I'm completely open to any suggestions on how it should be set up!

THANKS FOR ALL YOUR HELP!!!
Reply With Quote
  #2 (permalink)  
Old 03-05-10, 11:54
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,085
idea two is far less attractive because (a) it violates the spirit of first normal form, and (b) it requires an ALTER TABLE when you need to add a new feature

idea one is right way, but you don't need a join, nor multiple IN subqueries
Code:
SELECT Products.Product_Name
     , Products.Product_Description 
  FROM Products
INNER
  JOIN ( SELECT Product_ID 
           FROM Product_Features 
          WHERE Feature_ID IN ( 1 , 2 ) /* list of feature ids */
         GROUP
             BY Product_ID
         HAVING COUNT(*) = 2 /* number of required features */
       ) AS only_these
    ON only_these.Product_ID = Products.Product_ID
this approach also allows you to select products which have a minimum number of features... for example, the IN list could contain 6 feature ids, and the HAVING clause could specify COUNT(*) >= 4

you can't do that easily at all with joins
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #3 (permalink)  
Old 03-05-10, 12:14
temitayo_g temitayo_g is offline
Registered User
 
Join Date: Mar 2010
Posts: 3
Quote:
Originally Posted by r937 View Post
idea two is far less attractive because (a) it violates the spirit of first normal form, and (b) it requires an ALTER TABLE when you need to add a new feature

idea one is right way, but you don't need a join, nor multiple IN subqueries
Code:
SELECT Products.Product_Name
     , Products.Product_Description 
  FROM Products
INNER
  JOIN ( SELECT Product_ID 
           FROM Product_Features 
          WHERE Feature_ID IN ( 1 , 2 ) /* list of feature ids */
         GROUP
             BY Product_ID
         HAVING COUNT(*) = 2 /* number of required features */
       ) AS only_these
    ON only_these.Product_ID = Products.Product_ID
this approach also allows you to select products which have a minimum number of features... for example, the IN list could contain 6 feature ids, and the HAVING clause could specify COUNT(*) >= 4

you can't do that easily at all with joins
Thank you so much!!! This makes sense.

I have one more question, how do I deal with features that are not numerical, such as color? Would I have to add a row for every color in the features table?
I guess i could add a feature group column which will allow me to group feature id's by type such as color, style etc?

Last edited by temitayo_g; 03-05-10 at 12:17. Reason: an idea strcuck
Reply With Quote
  #4 (permalink)  
Old 03-05-10, 12:20
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,085
Quote:
Originally Posted by temitayo_g View Post
how do I deal with features that are not numerical, such as color? Would I have to add a row for every color in the features table?
that's correct

remember, your "feature_id" can represent any type of feature
Quote:
Features(Feature_ID[PK], Feature_Unit)
clearly, Feature_Unit would have to be VARCHAR to hold colour names
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #5 (permalink)  
Old 03-05-10, 16:29
dportas dportas is offline
Registered User
 
Join Date: Dec 2007
Location: London, UK
Posts: 732
Quote:
Originally Posted by temitayo_g View Post
I'm completely open to any suggestions on how it should be set up!
Depends. I'd like to know what "features" are in this model. Are they really a collection of the same type of attribute, same data types, same constraints and logic that apply in each case? If so, then option 1 looks like it could be on the right track. Not sure why the min and max columns are necessary though.

If however the features are truly different things with different data types or constraints or other logic then they are truly different attributes of a product. In that case option 2 looks much better and I would try to avoid option 1.

In either case, don't be afraid of change because change is good. If a business isn't changing then it is dying. Data models are a representation of some portion of the business and therefore it is natural that data models must change too. The only question is how you manage change, not how to avoid it.
Reply With Quote
  #6 (permalink)  
Old 03-05-10, 18:41
temitayo_g temitayo_g is offline
Registered User
 
Join Date: Mar 2010
Posts: 3
Quote:
Originally Posted by dportas View Post
Depends. I'd like to know what "features" are in this model. Are they really a collection of the same type of attribute, same data types, same constraints and logic that apply in each case? If so, then option 1 looks like it could be on the right track. Not sure why the min and max columns are necessary though.
The reason that I preferred option 2, was because not all features will have the same data types. I didn't post the specifics is because of the nature of the products and I was afraid of getting banned. Here is an example that hopefully wont offend anyone.
One feature is speed, not all products will have a speed setting, some will only have one speed and others may have 3. However on some products the speed functions are customizable and the min and max are to reflect that. For Example
If a product has a max feature speed of 0 then it doesnt have a speed feature
If has its min and max equal and not equal to 0 then speed is not customizable
If it has its max > the min then it is customizable
(guess there could be a Product_Customizable table that could store that information)

On the other hand the unit, Max and Min mean nothing to a feature like color. In that case there would have to be a different feature id for every possible color. (note: I'm thinking that we would have to have Feature_Group Table and add a Feature_Group_ID to the features table that way we can always tell when a feature has a group associated with it. Which may solve the speed type problem)

I hope I'm making sense. I'm getting to the point where I'm thinking aloud but it helps to have a sounding board.
THANKS SO MUCH FOR THE HELP!
Reply With Quote
Reply

Tags
database design, normalisation, tables

Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On