Results 1 to 5 of 5
  1. #1
    Join Date
    Oct 2003

    Smile please help - car database

    Hello everyone. This is my first post and I'm fairly new to database design, but I have spent some time trying to teach myself about normalization.

    I have a problem I was hoping someone could help me solve. It's probably easy for some of you, but since I'm just learning, it's giving me a huge headache.

    I'm creating a database of news articles related to cars. I'm trying to normalize the database as best I can but I can't figure out how I should create the table for cars.

    Here is what the database looks like:

    News: (newsid, date, carid)

    Car: (car id, year, make, model, trim)

    Since this is a many-to-many relationship, I will also create a table linking the news to many cars:

    Combined: (combinedid, Newsid, carid)


    Okay, here's my problem. It seems to me that I should be seperating the "car" table into seperate tables because if I don't, then there will be a lot of repeating information:

    Id: year: make: model: trim:
    001 2003 Honda Accord EX
    002 2003 Honda Accord ESX
    003 1999 Honda Accord E

    Etc, etc....

    So, should i seperate all these columns into a seperate table?
    For example:

    Year: (yearid, year)
    Make: (makeid, make)
    Model: (modelid, model)
    Trim: (trimid, trim)

    See, the problem is that each Model is related to one Make and each Trim is related to one Model.

    What is the best way to database the cars? Remember that each news article can relate to many cars, including different years of the same make and model.

    Any advice would be much appreciated.
    Thank you

  2. #2
    Join Date
    Oct 2003
    I'll start off by saying that I am not an expert programmer.

    I would say if you are limited on space then break it up otherwise leave it as is. It will be easier to program your DB with less linknig. Think about it if you break up

    Id: year: make: model: trim:
    001 2003 Honda Accord EX
    001 2002 Honda Accord EX
    001 2001 Honda Accord EX
    002 2003 Honda Accord ESX
    002 2002 Honda Accord ESX
    003 2003 Honda Accord E
    003 2002 Honda Accord E
    003 2001 Honda Accord E


    Id: make: model: trim:
    001 Honda Accord EX
    002 Honda Accord ESX
    003 Honda Accord E

    CarYearID CarId: year:
    0012003 001 2003
    0012002 001 2002
    0012001 001 2001
    0022003 002 2003
    0022002 002 2002
    0032003 003 2003
    0032003 003 2003
    0032001 003 2001

    your gain will not be that much. CarYear and your original car will have the same # of rows, you will only save space for 3 columns. This may have been an issue when you were working with 500mb HDs but with todays low cost HDs and today's DBEngines are designed to work with larger datasets than before, the size should not be an issue.

    Down side you now have to work with 2 tables instead of 1. You can now make more errors programming. Performance wise I don't know which is faster, but if I were to guess I would have to say the original may be faster since less linking. All your data is there.

    The school textbook answer would be to break it up, but I think that the practicle answer is to leave it as is.

    Hope this helps and Hope I'm right


  3. #3
    Join Date
    Oct 2003
    yes, it seems that the simple thing would be to leave it all in one table, but that is against what they recommend with database normalization.

    I do have plans to use the car database with other information too so I don't know what the best option is.

    I still need to learn more about interacting with the database, but I plan on entering all information through a web interface using PHP. I haven't figured out how to link everything to these tables yet.

    any other suggestions?

  4. #4
    Join Date
    Oct 2003
    Sorry can't help with PHP, its still on my to learn list.

    I rember I had learned some normalization formulas back in school. I also remember my teacher telling me that what he was about to teach we will never use in the real world. Remember that a database is information organized in a useful maner. You have to decide what is useful. I've seen many DBs that could have been normalized more, but then would not have been as functional for ex 2 tables

    Client_ID Client_Name Address TEL FAX

    InvoiceID InvoiceNo Client_ID ClientName Address TEL FAX ....

    Normalizing gives you

    InvoiceID InvoiceNo Client_ID

    Now you may think that the normalized way is better but what happens if a client changes his company name or address? If a client has invoices I1,I2 and after the first invoice he changes his information. In the first DB design I1 has his original info and I2 will have his new info, basically keeps a history of exactly how the data was entered. In the normalized method both I1 and I2 will have the updated info, which can cause future problems.

    Its good to know normalization because it will help you design a nice clean DB but at the same time you shouldn't go crazy with it. Its best to keep it in the back of your mind and use logic to help you design.

    just food for thought.

    good luck with your program

  5. #5
    Join Date
    Oct 2003
    Boston, Mass. USA
    An interesting link on normalization is:

    rongiul makes a good point. WHat works best in theory may not be the best for your situation.

    In the example given of invoices, if you normalize but you really need to know what address the client had when the invocie was issued, then you would need an address table

    Client_ID Client_Name, AddressID

    AddressID, Address (street/city/state/country etc.), StartDate, EndDate

    InvoiceID InvoiceNo Client_ID, InvoiceDate, AddressID

    Now if the client moved offices and you need to know which address a certain invoice went to thenyou follow the AddressID in the Invoice table.

    Now, a purer form of normalization would be even more complex:

    AddressID, Address (street/city/state/country etc.), StartDate

    InvoiceID InvoiceNo Client_ID, InvoiceDate

    Note there is no AddressID in the Invoice table. Instead you have to use the InvoiceDate and for that ClientID find the Address in the Address table that matches. Its an easy compare if the Address Table has a start and end date you want Invoice.InvoiceDate >Address.StartDate and Invoice.InvioceDate <Address.EndDate

    Howwever if the Address table only has a start date then you have to compare all the addresses for that client and find the right one. (Ill let you figure how )

    I always am in favor of both a StartDate and EndDate, even if it means that when the client moves the EndDAte of the old address is one day less than the start date of the current address. There are just too many deadlines to do it the longer way, and disk space is not THAT expensive that you cant do it the longer way.

    Unfortunately in the real world many businesses do not normalize much at all -- and do not have an address table. This is a problem when you mailorder something and have to give them your home address to verify billing with your credit card and then you give them shipping address. Unfortunately I have run into several times where they let you entera separate gift/shipping address ... but whoever wrote the program that produces the labels just pulls off the billing address and does not check for a shipping address. This makes little sense from the custmerviewpoint -- if they collect the shipping address why not use it? But it happens much too often.


Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts