design question -- lots of little tables or one big table with nulls
I'm designing a database to store data from my Suunto T6, which is a heart rate monitor, stop watch, altitude and distance measuring device used for training. The watch records data at either 2 sec or 10 sec intervals. Depending on the user's configuration, each of the parameters might or might not be recorded at each time (for instance, altitude won't be recorded if the watch is set in barometer mode and distance won't be recorded if the user doesn't have the foot pod). If I used one table with a row for each sample and a field for each parameter I would potentially end up with lots of NULLs. What about using a bunch of tables that are joined using a SampleTime table like
The application I'll build on top of this will be plotting the data and doing some statistics. I'm worried about the potential cost of doing these joins. The only other approach I can think of though is to have a flat table and leave NULLs for the values when altitude, distance, or whatever isn't collected. I'd like to leave the flexibility for importing data from other watches that might have other sensors (e.g. GPS) if I ever get that ambitious. With the one table approach, I'd add columns (potentially with null values). In this approach, I'd add extra tables.
Any help would be greatly appreciated. I'm relatively new to the database thing (usually analyzing data in a bunch of flat files) and I'd like to get it right before I start coding the GUI, graphing, and statistics.
Another approach would be to have a log_entry table like
log_id (fk), athlete(fk), sample_id(fk), value, time. A second table would contain the different sample types (altitude, heart rate, etc.) and log information (date, start time, exercise protocol, etc. ) and athlete information (name, etc.) would be in additional tables.
Thanks for the responses! That helps a lot. For now I'm going to go ahead with the one table with nulls for samples and continue designing the tables along those aligns. If I could bother you with another question (or two) that came up while I was working through this. Here's what I have so far
Fields: rrinterval_id (pk), log_id (fk), beat_number, interval
Notes: RRinterval is the time (msec) between heart beats. This is what the watch actually records.
Fields: sample_id (pk), time, distance, altitude, heart_rate
Notes: heart_rate is calculated from the RRinterval
Fields: split_id (pk), log_id (fk), time, distance, altitude, heart_rate
Notes: Split has a list of the time the user hit the lap button (or decided to add a split in the application I'll build). The rest of the fields are interpolated from the Sample table.
Fields: lap_id (pk), start_split (fk), stop_split (fk), start_landmark (fk), stop_landmark (fk), stats here or in a separate table?
Notes: Lap describes the event where the runner goes from one landmark to another. It might be from start to end of the route or mile marker 2 to 3 for example. Somebody who runs regular routes might want to compare "laps" from one run to another. If your first 5 miles, say, of one route is the same as the first 5 miles of another, you might want to compare these laps.
What's bothering me here is the Lap table. The way it's constructed, I could put start_split, stop_split from different logs and start_landmark and stop_landmark from different routes, which doesn't make sense. I suppose I could create a constraint somehow (I'm not sure how), but should I solve this structurally (sorry if that isn't the right word) by splitting the the split and landmark key into a composite key like
Fields: log_id(fk, cpk), split_number (cpk), time, ...
Of course I'd still have to impose some contraint on start_split_number, etc. A composite foreign key? Perhaps I'm making this too hard and I can just count on the application to make sure I don't have laps with splits and landmarks from different logs and routes.
I'm following myself up, but I realized that the Split table could have a landmark (fk, possibly NULL) attribute. A split is a point in time during a run and the fact that you were at a particular landmark along a route is an attribute of the split record. That simplifies the Lap table to just the fields: lap_id (pk), start_split (fk), and stop_split (fk).
There might not be any reason to be concerned with restricting laps to belong to the same same log. I could make a lap that started on 1 April and ended on 30 April say and it would make sense to calculate stats on that lap like total time, total distance, average heart rate, etc.