I'm currently working on a project for a classifieds ad site and I'm having a bit of trouble deciding on the structure of the database. The problem is as follows:
You have a category, but as the designer you do not know what these categories will be or what information is associated with them, and you want to leave it up to the user to determine this, i.e.
Create Category -> Add Custom Fields to this Category
Now, so far the only solution I can think of is to create a table in the database for each category when the category is actually created. This would work, but for some reason it doesn't "sit right" with me, and I feel there must be a better, more efficient way of doing this.
I've seen sites with registration systems that include one large text/XML column in the database.
You could call this column "meta_data" or something similar. Within this column is stored a data structure that represents various additional data. The flexibility comes from the fact that the database structure is rigid and the same for every user, while the contents of this column are open to the interpretation of the front end or other systems.
Here's an example:
Say you have one user login system and two clients that have different business requirements for user profiles.
The database structure for the users table is along these lines:
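Something along these lines (a runnable sketch using Python and SQLite for illustration; the exact columns beyond `meta_data` are assumptions, not from the original post):

```python
import sqlite3

# Sketch only: a fixed users table shared by every client, plus one
# free-form "meta_data" column whose contents the front end interprets.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users (
        id        INTEGER PRIMARY KEY,
        username  TEXT NOT NULL UNIQUE,
        password  TEXT NOT NULL,
        meta_data TEXT              -- free-form XML, schema decided per client
    )
""")

# Two clients can store differently shaped profiles in the same schema.
conn.execute(
    "INSERT INTO users (username, password, meta_data) VALUES (?, ?, ?)",
    ("ben", "hash1", "<profile><firstname>Ben</firstname></profile>"),
)
conn.execute(
    "INSERT INTO users (username, password, meta_data) VALUES (?, ?, ?)",
    ("sue", "hash2", "<profile><company>Acme</company><phone>555</phone></profile>"),
)
conn.commit()

for row in conn.execute("SELECT username, meta_data FROM users ORDER BY id"):
    print(row)
```

The table itself never changes; only the XML inside `meta_data` varies between clients.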
I think you will be able to control some of the performance issues by adjusting the size of the XML/free-form database column. That way you can limit the amount of data transferred to and from the database.
If you control the amount of data that can be stored, then I suppose the next area to take into account would be front-end performance in parsing/interpreting the data that the application sends and gathers. If the application can store the data efficiently and read it efficiently, you should be in good shape.
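The parsing side can be kept cheap; a minimal sketch of turning a stored profile blob into something the front end can use (element names like `<profile>` are just for illustration):

```python
import xml.etree.ElementTree as ET

def parse_meta(xml_text):
    """Turn a flat <profile> XML blob into a plain dict for the front end."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}

profile = parse_meta("<profile><firstname>Ben</firstname><city>Perth</city></profile>")
print(profile)  # {'firstname': 'Ben', 'city': 'Perth'}
```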
If I may make a suggestion, I would stay away from creating a separate table for each dynamic category. I say this for two primary reasons.
For one, in the long run the maintenance of a user-delegated number of tables could become a 'maintenance nightmare.'
Secondly, if you are using these tables in joins when selecting from the database, performance will drop significantly in relation to the number of joins the database must perform to gather data. As a rule of thumb, a one-table select statement is more efficient than a two-table select statement.
What database system are you using? Do you know if it supports an Object datatype? You may be able to build category objects in your application to represent these dynamic categories, giving you the flexibility you're after, and then store and retrieve those category objects...?
The experience I have had with a system like this was a PostgreSQL database that had a boxed login system across multiple clients. Each client had their own database, but the login system was identical across all clients. The front-end applications determined the user profile data that would be stored in the profile column (which had a length of 1024 bytes). One of the larger clients grew to 5000 national users, each with a required profile. On the PostgreSQL system, we didn't see any notable performance issues at the database level -- we did, however, see performance issues in the XML parsing on the front end. Those issues were traced back to the XML parser, and they really didn't relate to the size of the profile or its complexity. It was a new XML parsing library.
Thanks again for the response. The issue you raised about joins when using a tonne of tables is probably why it didn't sit right with me. I'll be using .NET (ASP/C#) along with MS SQL Server, and I've been wanting to learn and deal with XML anyway, so I guess it's as good a time as any!
A question, though, when it comes to querying the table: if you're searching within a certain category, would you do this through SQL on the database end using, say, LIKE, or does this get done on the application end by pulling all the records?
In my experience, we were using a text-based column in PostgreSQL to store the XML profile data, so a LIKE clause could (and did) work. It was a great way for us to find members by doing a comparison against some of their profile data. For example:
WHERE profile_data LIKE '%<firstname>Ben</firstname>%'
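That pattern works against any text column; a small runnable sketch of the same idea (using SQLite here for illustration; in production this would be the PostgreSQL or SQL Server equivalent):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, profile_data TEXT)")
conn.executemany(
    "INSERT INTO users (profile_data) VALUES (?)",
    [("<profile><firstname>Ben</firstname></profile>",),
     ("<profile><firstname>Ann</firstname></profile>",)],
)

# The LIKE filter runs in the database, so only matching rows cross the wire.
rows = conn.execute(
    "SELECT id FROM users WHERE profile_data LIKE ?",
    ("%<firstname>Ben</firstname>%",),
).fetchall()
print(rows)  # [(1,)]
```

The trade-off is that a LIKE over free text can't use a normal index, so it scans the column -- fine at thousands of rows, worth measuring beyond that.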
I think the newer SQL Server editions have an XML column type. I'm not sure whether a LIKE statement would correctly gather data from an XML column... Perhaps XSL transformations would work? When we really wanted to manipulate profile data, we used XSL transformations to get data into and out of the database. In our case, it worked great and allowed for some additional abstraction (though we could have done without XSLTs altogether -- see below). However, if SQL Server has an XML column type, it may be tightly integrated with XSLT.
In your case, I would certainly recommend finding the SQL syntax to search the database PRIOR to returning a result set to the front end. Regardless of what you are looking for, I'd take it as a principle that the database will do a faster search than any front end would. And by filtering the results before returning a result set, you also lessen the network traffic!
But don't let me send you down the path to an overly complex solution! Some of us (me) constructed the XML data by just concatenating strings together to form the XML document. As long as the string parsed fine, all was great! I honestly don't know if MS SQL will play nicely with this if you decide to go down the XML column road.
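If string concatenation feels fragile (it breaks the moment a value contains '<' or '&'), a small sketch of building the document with a real XML library instead, so it always parses:

```python
import xml.etree.ElementTree as ET

def build_profile(fields):
    """Build a <profile> document from a dict of field name -> value.
    The library escapes special characters, so the result always parses."""
    root = ET.Element("profile")
    for tag, value in fields.items():
        ET.SubElement(root, tag).text = value
    return ET.tostring(root, encoding="unicode")

xml_text = build_profile({"firstname": "Ben", "company": "Smith & Co"})
print(xml_text)
# <profile><firstname>Ben</firstname><company>Smith &amp; Co</company></profile>
```

The same idea applies in .NET with the framework's XML classes rather than manual string building.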
Best of luck with your dynamic column and XML! The world needs more XML-literate people, I reckon; I'm still resisting. My XML skills (I probably shouldn't even claim XML skills) are very basic.