View Poll Results: Which approach is better?
- 0. You may not vote on this poll
02-07-09, 11:53 #1Registered User
- Join Date
- Feb 2009
- South Florida
Field naming convention question.
I'm a PM and have been shown two different approaches on how to design a new DB for a new system we're building. Let me explain both examples and then I'll pose my questions.
Background info: This system will be handing well over 100,000 incoming transactions per month and even more transactions going out as well (same data that is coming in will need to be queried for the outbound). The # of transactions per month will only increase as the business grows. Each transaction will contain some demographic information about the user as well as some customized data that can be grouped into categories.
Design approach 1:
- Create a demographics table with fields like ID, salutation, first_name, middle_name, last_name, address_1, address_2, city, state, zip, phone_1, phone_2, email_1, email_2, and so on.
- Create category tables to house the custom data points that match to each category: the example category = "education" so create this category table with fields such as ID, demographicID, hs_grad_year, high_school_attend, gpa, going_to_college, and so on.
- Now, for the outbound data, we may need to send the data out using different field names so we would create a mapping table for what to use as the outbound fields (outbound fields can change depending on who we need to send the data to) so we can then create a client mapping table with fields like ID, client_field_name, internal_field_name. Take this a step further and create a table for internal fields to have just ID and internal_field_name so that in the mapping table you can refer to internalID instead of internal_field_name (save space in the DB).
Design approach 2:
- Have the system auto generate tables and fields as needed. All tables created will have field names in the table (aside from the unique ID field) labeled as field1, field2, field3, field4, field5, and so on. A table for demographics and each category will still be present just as in approach 1, but using this new field naming technique.
- For every table that uses this technique, a mapping table will need to be created to know what data is in each field to map to our internal field list so that a team of programmers can build reports off of the data.
- Then for data that is outbound, have another mapping table to know how to pass each of the field data to each client.
To me, approach #1 appears to be the more correct/stable way to design. Approach #2 I can see becoming a problem as we currently have never used this approach in the company and to get a group of programmers to quickly change to this design, understand it, and be able to run queries on it properly and efficiently without running into ambiguous field name errors.
My questions become:
- Has anyone designed DB's in the approach #2 manner that can help me to understand some more benefits to designing this way?
- For any other PM's out there, would you ever allow your team to design in a manner that you did agree with 100%? (the programmer that provided approach #2 is insisting that with the amount of traffic we plan to increase to, this is the only way to go and the most proper way)
Thanks all. I've been researching around here and elsewhere on the web and can't just get a solid direction. I completely understand approach #1 as it is how I've always known to design.
Thanks in advance.
02-07-09, 14:23 #2SQL Consultant
- Join Date
- Apr 2002
- Toronto, Canada
1. yes, some have
this generic approach, which embeds metadata (data about data, i.e. dictionary data, data describing what each field is for) in the database as another table of data, is often called "entity-attribute-value" or EAV
most database professionals would caution you not to take that route without years of experience in designing databases
2. i was a PM, for many years... gave it up, though, as it didn't agree with me
on a couple of occasions, i let someone build something that i thought was too risky, and in both cases it took ~way~ longer to get it working than traditional methods, and in one case never made it to production at all
your first approach is best
as a matter of fact, you don't need a mapping table at all -- specifically, all you need is a view, which collects data as appropriate (using complex joins, if necessary), and presents the results using whatever column names and datatypes you need for output
02-07-09, 15:29 #3Registered User
- Join Date
- Dec 2007
- London, UK
Have you considered using a Data Integration tool as part of the solution? There are various products available from Oracle, IBM and others. You don't mention what form the output for clients will take but the purpose of an integration tool is to do the transformations from a data source to a target. The tool is a repository for the mapping between the source and target and a platform for developers to create and maintaining those mappings. Look for features like lineage and impact analysis.
As for your PM repsonsibility, it depends on the role and the composition of the project team. I was a PM for 2 years and was very involved in design and technical problems. In my present company however PMs rarely get involved in technical decisions because we have technical architects for that.
02-07-09, 16:26 #4vaguely human
- Join Date
- Jun 2007
As it stands the 1st approach looks far better but then you also haven't said why the second approach was proposed in the first place - is there something you haven't mentioned? Is it purely the customisable data? If so then can you expand on your description of this part of the project.
It's rarely a good idea to be creating permanent tables on the fly and usually a poor idea to call them field1, field2 etc. However if you need to store parameters against a customer and you don't know what those fields may be then there are many approaches to take but I think we need more explanation before choosing between the approaches. The amount of traffic shouldn't make a difference but if the type of data changes with each and every customer then that makes a huge difference.
02-08-09, 19:44 #5World Class Flame Warrior
- Join Date
- Jun 2003
The most prudent thing to do at this point is track down whoever suggested option #2 to you and shoot them. Quickly and painlessly please. There is no reason to be cruel.