I am not sure this is the correct forum to ask this question in...or if it really is a question...read on and you will (hopefully) understand...

The task...
I am to design a web-based system and a DB to go with it that should be able to store arbitrary "Objects" with arbitrary properties of arbitrary "ObjetTypes". There should be a "PropertyMasterList" which contains all available properties. For each defined "ObjectType" there should be a defined set of "ObjectTypeProperties" from "PropertyMasterList".

This givest us these tables:

  • Object
    - There is a ForeignKey (FK) to ObjectType
  • ObjectType
  • ObjectPropertyValue
    - There is a FK to Object
    - There is a FK to PropertyMasterList
  • PropertyMasterList
  • ObjectTypeProperty
    - There is a FK to ObjectType
    - There is a FK to PropertyMasterList



In other word. We have a table where we can store objects in and another table where we can store property values for those objects. These two are our main tables. The other tables exist to define which properties are available for a specific object type (each object has an object type).



I have a problem...
Scenario 1: The end user (user) will get a webpage where he/she (from now on I will only write he) can select properties from the PropertyMasterList and enter search criteria for that property. When the user is finished he will press the search button and get back a list with objects that has properties matching his search criterias.

Senario 2: The end user (user) will get a webpage where he/she (from now on I will only write he) can select properties from the PropertyMasterList and enter search criteria for that property. For each property he selects from the PropertyMasterList he will also set a "SearchOption" for this property along with the search criteria. The "SearchOption" can have one of 5 values: 1. Must Match, 2. Important, 3. Default, 4. Less Important, 5. Must Not Match.
When the user is finished he will press the search button and get back a list with objects that has properties matching his search criterias ordered so those objects that match the most important properties are at the top of the list.

There are millions of objects in the database and eaxh object has at least 20 properties, some object types have 100 properties. So assume that the Object table contains 5 million records and the ObjectPropertyValues contains about 200 million records.

Since we have a user sitting in front of a webpage waiting for the result we must present the result in no more the 5seconds, preferable much quicker.

This whole problem description is a bit "Fuzzy" so try to read between the lines and understand what "they" probably want out of this.



The question...
The database design I have presented here is just to show an example of what the design might look like and to help you visualize how data might be stored. The database design can be changed in any way we want if it makes the problem easier to solve.

The interesting problem here is Scenario 2, which kind of makes this all a "ORDER BY" problem.

Lets say that we for each ObjectPropertyValue calculates a corresponding value indicating how well this ObjectPropertyValue matches the search criteria from 0=Must Not Match to 5=Must Match. Lets call this calculated value for SearchCriteriaMatchValue.

Now lets select and group by objects and sum up SearchCriteriaMatchValue for all objects who don't have a SearchCriteriaMatchValue = 0 in ObjectPropertyValue, and lets order the result by sum of SearchCriteriaMatchValue descending.

I have not tried this solution yet, so I don't know hw well it will perform yet.

OK, I think this approach would work, but is it the best way of doing it? Is there a faster way to do it? Is there a better way to design the database to solve the problem?

All suggestions, hints, tips and other feedback on this is much appreciated. And remember, the database is HUGE!

/H