Results 1 to 13 of 13
  1. #1
    Join Date
    Sep 2009
    Posts
    5

    Unanswered: car database problem

    Hi. I received a car database (MySQL tables) and I'm having a problem selecting what I want. It's a database with manufacturers, car models and their properties. For example Audi > A4 > with properties like air bags, ABS, 4 wheel drive, turbocharged... The standard thing you see on every car reselling site with advanced search.

    I'll simplify the tables for example purposes.

    Tables:
    "Brands": ID, brand_name
    "Type": ID, type_name
    "Cars": ID, brand_ID, model_ID, car_name
    "PossibleProperties": ID, property_name
    1 | Airbags
    2 | Fuel
    3 | Drivetrain
    4 | NumberOfDors
    5 | HorsePower
    "AvailableProperties": car_ID, property_ID, property_value
    1 | 1 | 1
    2 | 1 | 0
    2 | 2 | 'Diesel'
    2 | 5 | 120

    Every entry in Cars has ~100 properties assigned via AvailableProperties.

    And now my problem. I need to select all Cars that have certain properties. For example it must have `air bags`, must be `4 wheel drive`, must have a `V6 engine`, should have less than `2000ccm`...

    I know how to select Cars which are certain `Brand` and `Type`, but don't know how to find cars with all the other restrictions.

    I'm really new to this and I have no idea how to tackle this problem, so any help is appreciated.


    *Edit: Just noticed this might be the wrong section on the forum for this question. If so, sorry...
    Last edited by edmund22; 09-07-09 at 14:18.

  2. #2
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    here's an example for finding all cars with exactly these three properties --
    Code:
    SELECT car_id
      FROM AvailableProperties
     WHERE property_ID = 1 /* airbags */    AND property_value = '1'
        OR property_ID = 3 /* drivetrain */ AND property_value = '4 wheel'
        OR property_ID = 9 /* enginetype */ AND property_value = 'V6'
    GROUP
        BY car_id
    HAVING COUNT(*) = 3
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  3. #3
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    Just to add to Rudy's post:
    • Some of these properties require an exact match ie NumberOfDoors while some are just specifying a minimum value ie horsepower - I might specify a car with 150hp but I don't want to exclude 160hp cars. You'll need meta data to cope with this.
    • Do you want to show the user 100 properties and expect him to wade through them or will there be important properties and a set of other less important properties?
    • Would you want to be able to request a Japanese (or German or American) car ie a single property might cover multiple values?
    • If the search function is the main feature of your business then it may be worth getting someone in to do this part for you (if SQL is your strong point then please ignore this point).
    • Is car sales a good market to be getting into at the moment (it's died here in the UK)?

  4. #4
    Join Date
    Sep 2009
    Posts
    5
    Thanks for the replies.

    The code works. It finds the requested cars.
    But like mike_bike_kite said, there are more issues to it. Made me think if the database organisation is the right one for this. I got it from outside, but I could transform it to whatever would suit my needs better.

    And for all I know this is not limited to car sites. For example mobile.de - Search and Buy New Cars, Employees' Cars, and Used Cars has a load of properties to search for, similar to what I need.
    Similar or even same thing you might see when e-shopping. There you have categories of products, where each category has different properties attached to it. For example computer monitors have things like size, resolution, brightness, weight... While hard drives have capacity, speed, size... And then you can search through or filter out by these properties.

    I guess this is a similar (or same) database structure concept. "Attached properties to objects". Don't know how to call it differently. But from all I know, there is a table which hold properties, another table which holds possible values for those properties, an object table and a table which maps objects and properties (with their values) together. Maybe groups for the properties. Or am I wrong?

    Like said, I'd rather transform the whole database I got, than to run into problems with SQL because of the misconception of the DB structure...


    The things I need are:
    1. searching through object which have certain properties, exact values, ranges, and "a single property might cover multiple values" (like mike said), with sorting by the important properties
    2. output of objects sorted by category, with most important properties output (car brand, model, price, color...), with sorting
    3. output of single object with all the properties displayed

    I guess there must be an tutorial, example for such a concept. Database diagram + some SQL examples would be superb. I'm searching the internet, but couldn't find anything useful yet...
    Last edited by edmund22; 09-08-09 at 06:36.

  5. #5
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    Quote Originally Posted by edmund22
    I guess this is a similar (or same) database structure concept. "Attached properties to objects". Don't know how to call it differently.
    it is known as "entity-attribute-value" design or EAV, and it is a very poor design choice

    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  6. #6
    Join Date
    Sep 2009
    Posts
    5
    Quote Originally Posted by r937
    it is a very poor design choice
    Yeah I noticed this during the past few days. So is there a better solution how to achieve this?

    I can't imagine having a table with all the columns that you need to describe all the attributes needed is the solution. I would end up with a table which has 300 fields and 95% of them are NULL.

  7. #7
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    Quote Originally Posted by edmund22
    I would end up with a table which has 300 fields and 95% of them are NULL.
    and the problem with this is... ?
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  8. #8
    Join Date
    Sep 2009
    Posts
    5
    Flexibility...

    Imagine I have a "car" table, with all the features as fields. And then I have to add a new gadget, like GPS or something. I need to modify the table, the code, the output templates...
    I guess the key point here is that it's a lot of work and it looks ugly. It hurts me as a programmer to see such things

    I guess this is very true.
    Being a natural problem solver, and knowing that code re-use is a “good thing”, the developer thinks he can do better. Quite often, there is no one more experienced around to advise against, and so the developer implements his ideas
    So your advice would be, to stick with the 300 fields, do it the "normal" way and stop worrying about "the doing it better" part?

  9. #9
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    Quote Originally Posted by r937
    it is known as "entity-attribute-value" design or EAV, and it is a very poor design choice
    I know I'm going to regret this but ...

    The database that the OP has is already an EAV design and the SQL you provided was typical SQL against an EAV database. The world he's trying to model is quite simple - it holds data around cars (the entities) with a bunch of properties (the attributes) that have certain values (green, SUV etc). Isn't this what EAV was built for?

    Sure most folks frown at EAV systems for various (valid) reasons but if the world you're trying to model fits it perfectly then why go to the effort (and cost) of building a traditional database and creating bespoke screens, searches etc when the user has a system that already fits.

    The problem isn't the fact you end up with 100's of fields or tables in your database, it's the fact you have to write code to support and access 100's of fields and tables. With EAV (for this application) you're just supporting an attribute and it's value - the allowed attributes would be listed in one table along with meta data about that field ie it's a main search field, the allowed values for any field would be listed in a valid values table which would also be used by all the screens and feeds.

    If all those fields are just being treated as a simple property of an entity ie the colour is blue or the car type is SUV then why not treat them as a named property. It also makes it pretty easy to add or delete fields at a later date - something which is a royal pain in traditional databases.

  10. #10
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    Quote Originally Posted by edmund22
    Imagine I have a "car" table, with all the features as fields. And then I have to add a new gadget, like GPS or something.
    those would be two separate subtype tables, where the supertype table would be Products_for_Sale or something

    do a search on supertype/subtype

    as for EAV, we all understand that EAV sure looks nice as a storage mechanism

    it is formulating meaningful (and efficient) queries that is the hard part

    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  11. #11
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    Quote Originally Posted by r937
    those would be two separate subtype tables, where the supertype table would be Products_for_Sale or something

    do a search on supertype/subtype
    But won't you have to change the structure of the database at some point to add a new attribute?

    Quote Originally Posted by r937
    it is formulating meaningful (and efficient) queries that is the hard part
    I thought your query above was pretty good and very readable.

    This thread gave me an idea for a new type of search on my own little EAV project. The idea was to search for items that match on multiple fields or supporting entities ie a "Japanese silver SUV". It wasn't difficult generating a query that would match individual words in the string against supporting attributes or entities. My db holds data about countries so I asked for "UK islands in the Caribbean" and it gave me the following. The top items seem to match everything while the lower items were poorer matches.
    Code:
    Turks and Caicos Islands
    British Virgin Islands    
    CAYMAN ISLANDS       
    MONTSERRAT             
    ANGUILLA                  
    Bermuda                    
    FALKLAND ISLANDS      
    SOUTH GEORGIA AND THE SOUTH SANDWICH ISLANDS
    U.S. VIRGIN ISLANDS    
    Puerto Rico                 
    GIBRALTAR                  
    GUERNSEY
    I was very pleased with the results and I'll add the search feature in permanently into the system. It was also instant on my admittedly small db of 100k items.

  12. #12
    Join Date
    Sep 2009
    Posts
    5
    First of all thanks to you two. Was nice to see two opinions on this.

    I did some extensive research on EAV (now that I know how it's called ). Figured it's like the most such things, there are two camps, one that say this is wrong, and one that admits it's not the best, but necessary.

    I still have time to decide what to use until Monday, but I'm tending to use EAV for the car properties and normal relational tables for everything else.

    I figured out how to find objects with certain attributes (enhanced r937 example from first reply), also how to sort by the most important ones. This gives me the search I need, also the ability to list objects in categories with sorting.
    Since I always planned to use separate table for values, which will be linked in the m:n table between objects and attributes, it's easier to search for things, since you only relate to the IDs of values instead of values (which would complicate things by a lot). This way you can add values like "0 to 50.000 km", and only search for the ID when a user selects it from a SELECT list.
    Maybe I'll even store a generated TEXT value in the object, for quick output of all of the attributes, which will also be searchable and will come close to what mike wrote in the last reply.

    Oh and the most important thing. It seems like these operations are only ~10% slower than the use of normal tables, which is fine by me. I rather take the CPU time than maintenance cost.

    Thanks again for the help.
    Last edited by edmund22; 09-09-09 at 04:56.

  13. #13
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    I really liked the idea of typing "green diesel SUV" into a search box and getting good results. So I did some work last night and implemented it on my demo EAV system. My system holds data on countries rather than cars but the principal is the same - just match and join attributes and entities against the words to get the best matches. I've listed a few examples I tried:

    uk island - searches for islands under control of the UK
    • Turks and Caicos Islands
    • SOUTH GEORGIA AND THE SOUTH SANDWICH ISLANDS
    • FALKLAND ISLANDS
    • CAYMAN ISLANDS
    • British Virgin Islands
    • MONTSERRAT
    • JERSEY
    • GUERNSEY
    • GIBRALTAR
    • British Indian Ocean Territory
    • Bermuda
    • ANGUILLA

    french africa islam - searches for countries speaking french, in Africa who's religeon is Islam
    • SENEGAL
    • NIGER
    • MALI
    • GUINEA
    • CHAD
    • BURKINA FASO

    uk under 10000 - searches for countries under control of the UK with under 10000 people - numbers are currently tied to the population field.
    • SAINT HELENA
    • PITCAIRN
    • MONTSERRAT
    • FALKLAND ISLANDS
    • British Indian Ocean Territory

    dictatorship asia - searches for dictatorships in asia
    • VIETNAM
    • TURKMENISTAN
    • SYRIA
    • NORTH KOREA
    • LAOS
    • IRAQ
    • CHINA

    dictatorship ruhbarb asia - ruhbarb matches nothing so that part is ignored and the results are the same as above.
    • VIETNAM
    • TURKMENISTAN
    • SYRIA
    • NORTH KOREA
    • LAOS
    • IRAQ
    • CHINA


    I thought the above results were quite interesting - sure the system isn't perfect but it will work with any data without needing to recode so if the system held car data then typing "green diesel SUV" would list what you'd expect. I'm guessing that trying to do this in a normal database design would be quite difficult.

    The above isn't meant as an attack on standard database design but is just trying to show that EAV does have a few benefits to certain types of system - typically those where you have a bunch of changing attributes on a group of entities. You're welcome to try these examples out on my little system but remember that words must exist somewhere in the database (ie yanks won't match against americans) and that it will only match against one word at a time (ie United States is treated as 2 words) so you'd want to use USA in this case.

    Mike

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •