| |
|
If this is your first visit, be sure to check out the FAQ by clicking the link above.
You may have to register before you can post: click the register link above to proceed.
To start viewing messages, select the forum that you want to visit from the selection below.
|
 |

03-05-10, 10:33
|
|
Registered User
|
|
Join Date: Mar 2010
Posts: 3
|
|
Newbie Database Design Question, Can the database structure constantly change?
|
|
Hi everyone!
I have what might seem like a simple/stupid question, but I'm new so please bear with me.
I'm trying to build a products database with the following relationships for a web application. Each product will have certain features that can change fairly rapidly and I'm wondering how to store the features information. Basically I have two ideas as shown below:
IDEA ONE
Tables: - Products(Product_ID[PK], Product_Name, Product_Description)
- Features(Feature_ID[PK], Feature_Unit)
- Product_Features(Product_Feature_ID, Product_ID[FK], Feature_ID[FK], Product_Features_Max, Product_Features_Min
In this setup we have one Product_Features table with all data in it which means if I want to find all products with two features I have to do a join as follows (note this may be erroneous thinking on my part):
SELECT Product_ID FROM Product_Features
WHERE Product_ID IN (SELECT Product_ID FROM Product_Features WHERE Feature_ID = 1 AND Product_Features_Min > 1 ) AND Product_ID IN (SELECT Product_ID FROM Product_Features WHERE Feature_ID = 2)
Disadvantages: This can get really redundant since some of the products will have up 15 features and we might want to search on all of them. It just seems like bad planning to have to join a table to itself at the design phase.
It also gets tricky if a feature has anything other than numerical values. For example if a product is available in several colors.
IDEA TWO
Tables: - Products(Product_ID[PK], Product_Name, Product_Description, Product_Archived)
- Features(Feature_ID[PK], Feature_Unit, Feature_Description)
- Product_Features(Product_Feature_ID, Product_ID[FK], Feature_1, Feature_2, Feature_3, etc (Not sure if this is necessary, but hold information as to whether a certain product has a feature with only yes or no values.
- ..... (Lots of different individual feature columns)
In this set up we break each feature into its own table for example, the Product_Color table would look like this (comma delimited, bold=column names, ends with ):
Product_ID[FK], Product_Red, Product_Black, Product_White
1, Y, Y, N
2, N, Y, Y
... etc
Disadvantages: If the features are changing fairly regularly the database structure would also be constantly changing (which I'm told is bad).
QUESTIONS: Performance will also be very important, and I really don't know which would do better performance wise.
NOTES: I honestly prefer option 2, it seems cleaner and more efficient to me, but I really want it to be right. I learned most of what I know about databases and design while I worked for a company that had a horrible data model (eg there were tables that had columns called date_added, date_modified and date_deleted that would all update when a record was modified, empty code tables, and absolutely no documentation -- most of the time to find data you had to use the testing application to reverse engineer data  ). Since working there I am now terrified that I will fall into the same mistakes that they did in order to get things working.
I'm completely open to any suggestions on how it should be set up!
THANKS FOR ALL YOUR HELP!!!
|
|

03-05-10, 10:54
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,516
|
|
idea two is far less attractive because (a) it violates the spirit of first normal form, and (b) it requires an ALTER TABLE when you need to add a new feature
idea one is right way, but you don't need a join, nor multiple IN subqueries
Code:
SELECT Products.Product_Name
, Products.Product_Description
FROM Products
INNER
JOIN ( SELECT Product_ID
FROM Product_Features
WHERE Feature_ID IN ( 1 , 2 ) /* list of feature ids */
GROUP
BY Product_ID
HAVING COUNT(*) = 2 /* number of required features */
) AS only_these
ON only_these.Product_ID = Products.Product_ID
this approach also allows you to select products which have a minimum number of features... for example, the IN list could contain 6 feature ids, and the HAVING clause could specify COUNT(*) >= 4
you can't do that easily at all with joins 
|
|

03-05-10, 11:14
|
|
Registered User
|
|
Join Date: Mar 2010
Posts: 3
|
|
|
|
Quote:
Originally Posted by r937
idea two is far less attractive because (a) it violates the spirit of first normal form, and (b) it requires an ALTER TABLE when you need to add a new feature
idea one is right way, but you don't need a join, nor multiple IN subqueries
Code:
SELECT Products.Product_Name
, Products.Product_Description
FROM Products
INNER
JOIN ( SELECT Product_ID
FROM Product_Features
WHERE Feature_ID IN ( 1 , 2 ) /* list of feature ids */
GROUP
BY Product_ID
HAVING COUNT(*) = 2 /* number of required features */
) AS only_these
ON only_these.Product_ID = Products.Product_ID
this approach also allows you to select products which have a minimum number of features... for example, the IN list could contain 6 feature ids, and the HAVING clause could specify COUNT(*) >= 4
you can't do that easily at all with joins 
|
Thank you so much!!! This makes sense.
I have one more question, how do I deal with features that are not numerical, such as color? Would I have to add a row for every color in the features table?
I guess i could add a feature group column which will allow me to group feature id's by type such as color, style etc?
|
Last edited by temitayo_g; 03-05-10 at 11:17.
Reason: an idea strcuck
|

03-05-10, 11:20
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,516
|
|
Quote:
Originally Posted by temitayo_g
how do I deal with features that are not numerical, such as color? Would I have to add a row for every color in the features table?
|
that's correct
remember, your "feature_id" can represent any type of feature
Quote:
|
Features(Feature_ID[PK], Feature_Unit)
|
clearly, Feature_Unit would have to be VARCHAR to hold colour names
|
|

03-05-10, 15:29
|
|
Registered User
|
|
Join Date: Dec 2007
Location: London, UK
Posts: 732
|
|
Quote:
Originally Posted by temitayo_g
I'm completely open to any suggestions on how it should be set up!
|
Depends. I'd like to know what "features" are in this model. Are they really a collection of the same type of attribute, same data types, same constraints and logic that apply in each case? If so, then option 1 looks like it could be on the right track. Not sure why the min and max columns are necessary though.
If however the features are truly different things with different data types or constraints or other logic then they are truly different attributes of a product. In that case option 2 looks much better and I would try to avoid option 1.
In either case, don't be afraid of change because change is good. If a business isn't changing then it is dying. Data models are a representation of some portion of the business and therefore it is natural that data models must change too. The only question is how you manage change, not how to avoid it.
|
|

03-05-10, 17:41
|
|
Registered User
|
|
Join Date: Mar 2010
Posts: 3
|
|
Quote:
Originally Posted by dportas
Depends. I'd like to know what "features" are in this model. Are they really a collection of the same type of attribute, same data types, same constraints and logic that apply in each case? If so, then option 1 looks like it could be on the right track. Not sure why the min and max columns are necessary though.
|
The reason that I preferred option 2, was because not all features will have the same data types. I didn't post the specifics is because of the nature of the products and I was afraid of getting banned. Here is an example that hopefully wont offend anyone.
One feature is speed, not all products will have a speed setting, some will only have one speed and others may have 3. However on some products the speed functions are customizable and the min and max are to reflect that. For Example
If a product has a max feature speed of 0 then it doesnt have a speed feature
If has its min and max equal and not equal to 0 then speed is not customizable
If it has its max > the min then it is customizable
(guess there could be a Product_Customizable table that could store that information)
On the other hand the unit, Max and Min mean nothing to a feature like color. In that case there would have to be a different feature id for every possible color. (note: I'm thinking that we would have to have Feature_Group Table and add a Feature_Group_ID to the features table that way we can always tell when a feature has a group associated with it. Which may solve the speed type problem)
I hope I'm making sense. I'm getting to the point where I'm thinking aloud but it helps to have a sounding board.
THANKS SO MUCH FOR THE HELP!
|
|
| Thread Tools |
Search this Thread |
|
|
|
| Display Modes |
Linear Mode
|
Posting Rules
|
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts
HTML code is Off
|
|
|
|
|