Looking for a little help in designing a database. I'm going to be aggregating a LOT of demographic data, covering everything from crime stats to student/teacher ratios. This data will be queryable via geography, time, and possibly other attributes specific to the data type. This will all be made available for researchers to pull from in an ad-hoc fashion via the web.
Normalizing this data so that metrics specific to "county A" are accessible, or for "year X" isn't difficult. The hard part is of course doing queries to any subset of tables that relate to the table containing county names or school districts, etc.
This seems like a case for a data warehouse, creating fact tables for each metric and dimensions for geography, time, whatever. It seems that this type of analysis is what OLAP databases are designed to do. However, data warehouses are never represented using anything other than sales, order, store location data, which doesn't have a lot in common with data having much larger hierarchies, for instance. So I am basically asking if a) a data warehouse structure is indeed the recommended approach, or b) if there are alternative designs or even systems that might be able to perform this job in an easier way. Thanks for any help you guys could pass me.
I see no reason you shouldn't be able to use a regular, normalized database schema.
And, by the way, "fact" tables are NOT synonymous with data warehouses. A data warehouse does not have to be set up using fact tables or star schemas. Fact tables are more likely to be found in data marts masquerading as data warehouses.
If it's not practically useful, then it's practically useless.
- OLAP cubes create aggregate values automatically, as opposed to me dealing with it myself via views or sprocs or generating extra columns somewhere.
- Querying via dimensions looks like it's going to be less of a headache than generating different PIVOT statements based on each table.
Both of those jump out at me as things OLAP will provide over a normalized schema. I'm posting because I am open to ideas, and I do realize you can do this via a normalized schema, as that is what I'm doing right now - I'm just looking for what might be more optimal solutions.
And the fact table argument seems a little bit of semantics here - every bit of documentation on data warehousing seems to reference facts, dimensions, attributes, etc. and I think I got my point across...?