Before getting into too much detail I have a question about how to go about storing yearly data. I need to be able to go back in time and see a snapshot of the data at that time. I was using Access where table size was an issue, but now I'm going to use SQL Server 2005 Express where it is not, to an extent.
In Access I used separate tables for each year. It sounds like the best practice would be to combine the tables into one and add a year field which makes sense for my purchases table. However it doesn't seem so simple for my products table. Each year our product table changes, not by much, but a little none the less. If I go back to 2003 I need only products available that year to be listed. This is easy with separate tables and I guess would be easy too if I combined the product tables and used a year field.
My problem with this is there would be so much duplicate data except for the year field. For example Jim's Candy might be a staple in our product line and doesn't change. There would be several fields, mostly text, that wouldn't change. So for the next 10 years Jim's Candy will be in the table 10 times with nothing changing except the year field. To fix this I thought I'd just make a table that had the product ID and then a year field. It would still contain 10 entries for Jim's Candy, but only two fields instead all the fields in the product table. In reality this intermediate table would hold several other fields that could change on a yearly basis.
Does this sound like the best method? My overall problem is a bit more complicated, but I'd like to start here.
Rudy's right. As an alternative you could give your products a year range using a start year column and an end year column. You then select all the records where the start year is less than or equal to the snapshot year and where the end year is greater than or equal to the snapshot year.
Thanks for the replies. Certus I can't use your suggestion because it is not guarenteed that years a product is available will be contiguous.
So now for the complicated part. We have suppliers that make multiple products and some that only make one. We need to be able to view data based on the supplier and individual products both. So my thought is to have parent products and child product(s). There would be a parent table and a child table linked by an intermediate table. For each parent there must be at least one child product. Technically the parent would not be a product it would be the supplier. All the products would be in the child table. So using the 'years' table idea I figure I need four tables. One to hold the static info for the parents, one to hold the years and non-static info for the parents, one to hold the static info for the children, one to hold the 'years' and non-static info for the child. The parent would be linked to the child in the last table mentioned. I'll diagram the relationships below.
Please note I simplified the tables a bit as far as the amount of fields. Basically I'm looking for opinions. I have this setup and it works based on the little testing I've done. However now I have four tables replacing the one table per year used in the old structure. It makes queries a bit more complicated. But I can do other tasks much easier using the new structure. I can't/won't go back to the old method of one table per year so I guess I'm looking to see if there is a better method similar to the new structure above.
In this way a supplier can have multiple products and a product can have multiple suppliers.
In this way there is no redundancy in years, in products or in suppliers.
The work would then be not in the database, which is normalized, but in the application where you would have to come up with a simple means of maintaining the year, product and supplier tables and a simple means of relating years to products and products to suppliers. And that is not rocket science either.
Certus - thanks for the graphic. I believe what complicates this is that the supplier has data that will change from year to year. So the suppliers need their own years table. While looking at my original diagram I noticed that there will not be any non-static data in the products table other than the supplier ID. I've redone my diagram using your naming scheme.
Certus I think we are on the same page on the structure. Again the only difference is the years table which I'd like to expand upon.
You use a yearID where I just use the year itself. Is this merely conceptual on your part? The way I see it is that the yearID is a LONG INT and a year is only an INT or even could be a LONG INT. So as far as performance or structure goes there wouldn't be any significant difference would there? I mean everywhere you have a yearID I would have a physical year.
To add to this my purchase table, right now, has a field for a 4 character distributor code and a 10 character product code. This done because the text codes are easier to work with than ID numbers. Reports are entered into the DB using these codes. But really I should convert the code to the proper ID when entering it into purchase table right? Each distributor and product has an ID and a text code.
PS. thanks on the signature. I saw another member who had an if/else block and I thought it was funny. I've been using the Goto Hell in my code for a couple years.
On Error Goto Hell
Msgbox("An error occurred, but was sent to Hell. You may continue.")
gwgeller, I have a confession to make. I am partial to surrogate keys. What you are using are natural keys. A surrogate key is hidden by the application and never seen by the user. The application uses it to maintain referential integrity because the surrogate key values once assigned never change. In the case of Year, SupplierCode and ProductCode, you are dealing with natural keys. Natural keys are seen by the user and the user knows what those Codes represent. If a natural key used as a primary key and is changed you have to cascade the change to all the child tables that use the natural key as a foreign key. This is acceptable in most cases.
I should also point out that if you use a surrogate key you can still use the natural key, but you have to create a unique index on the natural key to protect against duplicates.
i would rather not post that flim-flammery with a surrogate key for the year, even if you think that surrogate keys are the cat's pajamas, because the recipient of your largess isn't up to speed enough to determine that you are actually making a sly joke
be clear, be open, be frank, and please don't use a surrogate key for the year