Thread: A question on Relational Model
10-11-04, 11:34 #1Registered User
- Join Date
- Aug 2003
Unanswered: A question on Relational Model
I couldn't quite get my head around the following.
"The use of bit fields departs from a normalized relational model, since one column represents more than one value".
I can't see how the use of bit field departs from the relational model. Relational database takes advantage of foreign key definitions to maximize storage efficiency. But how does using bit field violate that model? Using a bit field is no different to using a varchar2 field.
This made me question what exactly is the relational model? Anyone has the answer?
10-11-04, 12:36 #2Registered User
- Join Date
- Apr 2004
By "bit fields" they mean a field where each bit (or bytes) represents a different piece of data, and could be stored in different fields.
For example, a sales rep's territory is 01-34-67. The 01 represents is division, the 34 is the region within the division, and 67 the territory within the region. The territory could be stored as a varchar2 "01-34-67", or stored as a number (or varchar) where each value is stored as an ascii value in a byte of a 2 byte field.
For relational design, you would create 3 fields for this data. For better relational design, you would create a hierarchy to connect these (since a reorg in the company could shift terr 67 into a different region), and map the display values (01, 34, 67) to non-intelligent pks (using sequences).Give a man a fish, you feed him for a day. Club him over the head with a fish, he'll leave you alone.
10-11-04, 12:51 #3Moderator.
Provided Answers: 1
- Join Date
- Sep 2002
You have bumped into the thorny issue for the relational model: what is an atomic value?
First Normal Form (1NF) says that each attribute (column) in a tuple (row) contains a single value from its domain. In fact, all tables are in 1NF by definition, since you cannot store more than 1 value in any column. However, what if the single value you store is composite, like shoblock's example, or like 'red,yellow,blue'?
Well, according to the relational model, that is fine: those values are atomic with respect to the database. However, that doesn't mean it is good practice: it isn't (perhaps) as clean and easy to work with as a design where those "atomic" values are further decomposed.
Consider a DATE column: everyone agrees that is atomic, and 1NF is not violated by using a DATE column, right? Yet someone else might say that the DATE could/should be decomposed into Year, Month, Day, Hour, Minute, Second for some purpose.
This has been discussed at length, and quite entertainingly, here!Tony Andrews
10-11-04, 14:45 #4Registered User
This has been discussed at length, and quite entertainingly, here!
- Join Date
- Aug 2004
10-11-04, 15:28 #5Registered User
- Join Date
- Apr 2004
I thought about using the date field as an example for that very reason. In Oracle, a date is stored using 7 bytes, each containing a different, distinct, portion of the date:
10/11/2004 14:00:24 Typ=12 Len=7: 120,104,10,11,15,1,25
1 row selected.
So, what defines "atomic" is up for argument, but the "JAN" part of a date is meaningless as a date without the rest of the field. If you're running quarterly reports, it's better to define the quarters as 1-4, and assign each a start and end date. The bad approach would be to store the value JAN (or APR, JUL or OCT) with each row to define the quarter. If you changed your fiscal calendar then all your data would be wrong. So, if the atomic element is usable by itself (the region code in my example), then store it as itself. But if you think "JAN" is usable by itself for reporting, then make a char field, don't store it as a date field. Once you create that multi-value field, someone is going to want to use the elements individually, and then you have other problems (field like '%blue%' is not as efficient as using a detail table where exists a blue row).
In the 'red,yellow,blue' example, each could be (and probably is) used independently, so they should not be stored as one field. Phone numbers are up for grabs - store the area code with or without the main number? Area codes change sometimes (very often in NJ), but the rest of the number doesn't help without the area code.
Rather than argue 1st, 2nd or 3rd normal form (as many people do, and many job interviews want explained), I find that a few simple rules work best: if it might ever, EVER change, put it in its own table with a sys-generated id to be stored elsewhere (people are always surprised to find out that social security numbers can change); if it can have more than one value then put it in a detail table because the number of allowed values might grow (except for 7 days a week and 12 months per year). Hey, there's always an exception to every rule.
I've seen systems where the employees dependents (kids) were stored as dep1, dep2... dep10 on the emp record - the first guy with 11 kids broke the system. Of course storing a persons address line in a seperate table from the city/state is overkill (most agree to that). Timesheet systems are easier to build when the table contains 7 columns for the daily hours. You might think that your employee number will never change, and is a great pk/fk, but when companies merges, these are all subject to change.Give a man a fish, you feed him for a day. Club him over the head with a fish, he'll leave you alone.