If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > General > Database Concepts & Design > any advantage to minimize number of tables?

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 05-25-09, 14:15
scamper_22 scamper_22 is offline
Registered User
 
Join Date: May 2009
Posts: 8
any advantage to minimize number of tables?

Hey all,

I come mainly from an embedded background. I took database design courses at university getting my engineering degree, so most my db knowledge is fairly academic.

Anyways, I'm working on a project interacting with a database. I find it's design a bit weird. It essentially stores a list of counters for tests.

Test1
counter1 at time1
counter2 at time1
...
counterx at timen

Test2
counter1 at time1
counter1 at time2
...
countery at timen

Now, the way the DB is arranged looks like they tried to minimize the number of tables. So the information is stored in one big 'counter' table with the following columns.

counterid, test, time, counter, value
1, Test1, time1, counter1, value1
2, Test1, time1, counter2, value2
---

I would think this would be better represented by:

Test Table (1 table, with 1 entry per test):
testid, test name, counterTable

Test Counter Table (1 table per test, with one entry per counter):
counter, CounterValuesTableID

CounterValuesTable (1 table for every test for every counter)
time, value

---

Is there any reason you would want to have all the counters and values in one giant table? Seems to me it would totally muck up performance. Is it perhaps simpler to view the data from various tools if there are fewer tables?

Also, what is the best tool to view/manipulate databases from a programmers point of view. Right now, I just use visual studio 2008 to connect to the DB and view/edit its contents.

Thanks all,
Reply With Quote
  #2 (permalink)  
Old 05-25-09, 14:32
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
one table per test is really poor design, and one table per test per counter would be a total disaster

performance on large tables is optimized by the use of indexes that are appropriate to the nature of the queries to be performed

visual studio works on only a small number of database systems (where small means 1), and there's a good argument for a front end tool that works with multiple database systems

__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #3 (permalink)  
Old 05-25-09, 14:59
scamper_22 scamper_22 is offline
Registered User
 
Join Date: May 2009
Posts: 8
why would having a separate table per test be really bad?

Isn't that what db normalization (if memory serves me right) is all about?
Am I wrong to think of DBs as giant hash tables?

If I'm go to say get the counter values for a particular test, I only search the relevant data if everything is in its own table.

If everything is in one table. I have to search through the entire table looking to match the testID and the counterID.
Reply With Quote
  #4 (permalink)  
Old 05-25-09, 16:11
Pat Phelan Pat Phelan is offline
Resident Curmudgeon
 
Join Date: Feb 2004
Location: In front of the computer
Posts: 12,605
Thinking from an embedded perspective, if you have to send bytes out ports and there are only 500 valid ports, how much sense would it make to code separate routines to send all 256 possible byte values to all 500 ports? You'd only have 128000 routines to manage, and each one would be exactly what you need.

One aspect of normalization works like subroutine (or macro) parameterization. By factoring out the common attributes like counter, test, and time you have one table that can store all of your tests instead of potentially thousands of tables. This makes the system much easier to both understand and manage, as well as eliminating lots and lots of wasted space.

-PatP
__________________
In theory, theory and practice are identical. In practice, theory and practice are unrelated.
Reply With Quote
  #5 (permalink)  
Old 05-25-09, 17:29
dportas dportas is offline
Registered User
 
Join Date: Dec 2007
Location: London, UK
Posts: 732
I don't really get your examples but then I don't know your data. One point that might be relevant is The Principle of Orthogonal Design (see "Date on Database" by Chris Date, published by Apress). This is the idea that (roughly speaking) it ought to be impossible for the same tuple to appear in more than one place in the database. It's a practice that can minimise the amount of logic and processing required, even though that's not Date's main argument in its favour.

Like other design rules this is "formalised common sense" I suppose. On occasions I have consciously applied just such a principle in design, even before I knew of it as a formal rule by any name.
Reply With Quote
  #6 (permalink)  
Old 05-25-09, 17:38
scamper_22 scamper_22 is offline
Registered User
 
Join Date: May 2009
Posts: 8
dportas and pat,

Yes, that is what I was thinking. DB normalization would be good.
The DB would be better if it were normalized... having a table per test... at the very least.

This is in contrast to how it currently is, with one giant table with lots of duplicated data.

What I am curious about is why r937 suggests the use of one large table.
"one table per test is really poor design, and one table per test per counter would be a total disaster"

This translates to: "normalization would be a total disaster"

Unless my examples were misunderstood.
Reply With Quote
  #7 (permalink)  
Old 05-25-09, 18:19
dportas dportas is offline
Registered User
 
Join Date: Dec 2007
Location: London, UK
Posts: 732
Quote:
Originally Posted by scamper_22
What I am curious about is why r937 suggests the use of one large table.
"one table per test is really poor design, and one table per test per counter would be a total disaster"

This translates to: "normalization would be a total disaster"
More tables doesn't necessarily mean "more normalized" and one table doesn't necessarily mean "less normalized". The regular Normal Forms have nothing to say about having possibly redundant tables. For example if the schema:

R{A,B}

is in 5NF then so could be the schema:

R1{A,B}, R2{A,B}, R3{A,B}, ... etc

You can't say that one is "more normalized" than the other.

However, according to Date, if it's the case that my second example allows the same tuple to appear in more than one relation then the former design with just ONE relation is a "better" one. If you don't follow that rule then you may be forced to implement the same logic in three or more places.

I'm not assuming anything about which design is better in your case. Just my 0.02.
Reply With Quote
  #8 (permalink)  
Old 05-25-09, 20:29
Pat Phelan Pat Phelan is offline
Resident Curmudgeon
 
Join Date: Feb 2004
Location: In front of the computer
Posts: 12,605
Hold your horses there Newt! I might not have made myself clear.

My analogy to embedded programming was if you had 500 ports that you might possibly need to send bytes. Having the 128000 separate routines (one for each byte value going to each of the 500 ports) would be like having separate tables for each test. One routine with two arguments (port number and byte value) would be like having the single table with columns to describe the applicable data for each row.

There is no way on earth that I'd code and manage 128000 routines or 128000 tables when just one of them with appropriate atributes (or arguments) would work. I'm way too lazy to tackle that just for the amusement value!

-PatP
__________________
In theory, theory and practice are identical. In practice, theory and practice are unrelated.
Reply With Quote
  #9 (permalink)  
Old 05-27-09, 10:43
scamper_22 scamper_22 is offline
Registered User
 
Join Date: May 2009
Posts: 8
alright, let me provide a bit of a clearer example. It's a simplified version of the one above.

There are lots of test. Each test contains a like of values at a time.

Suppose we have the following table.


testID, DateTime, value
1, 12:03, 100
1, 12:04, 101
1, 12:05, 102
...
2, 1:30, 200
2, 1:45, 300
..

So suppose I often do a query to get all the values for a test.
With all this in one table, it would mean it would have to search through this giant table finding all the rows with a matching testID.

If however it was in different tables... for example... one table per test.

TestTable
TestID, TestTableName
1, table1
2, table2

Table1
dateTime, value
12:03, 100
12:04, 101
12:05, 102
...

Table2
1:30, 200
1:45, 300

This way, It would be much easier to get the values for a particular test... they are all in one table for the test.

Does this make the question clearer?
Reply With Quote
  #10 (permalink)  
Old 05-27-09, 10:52
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
Quote:
Originally Posted by scamper_22
With all this in one table, it would mean it would have to search through this giant table finding all the rows with a matching testID.
no, it does not mean that

you are not familiar with database indexes?
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #11 (permalink)  
Old 05-27-09, 11:15
scamper_22 scamper_22 is offline
Registered User
 
Join Date: May 2009
Posts: 8
No I'm familiar with indexes... but why would you rely on them instead of just organizing the data in a logical manner

I'd imagine backing the index would be a another hidden hash or table.

Is it basically a consensus that in general, this is how these sorts of databases are designed... big tables relying on indexes?

I don't have an issue with it.. .just found it curious. It would save me some headaches if I had to design it. Instinctively, I would have designed as as separate tables :P
Reply With Quote
  #12 (permalink)  
Old 05-27-09, 16:26
mike_bike_kite mike_bike_kite is offline
vaguely human
 
Join Date: Jun 2007
Location: London
Posts: 2,519
Quote:
Originally Posted by scamper_22
No I'm familiar with indexes
Be carefull when designing your index - at first glance it looks obvious to have a compound index on testid and DateTime but if you do only have 2 values for testid then the index will never be used. Most optimisers will decide that because using testid will only reduce the search by 50% then it's not worth using the index at all. If you have lots of test ids then this point can be ignored. Also DateTime is likely to be a keyword so a bad choice for a field name.
Quote:
Originally Posted by scamper_22
but why would you rely on them instead of just organizing the data in a logical manner
What would you do if someone came along with another 5 tests? or another 25 test? Having separate tables would mean you'd need to create new tables each time, the code (to insert, search and show the data) would probably need to be altered to be able to use these new tables. All this effort for no performance gain at all.
Reply With Quote
  #13 (permalink)  
Old 05-27-09, 18:18
scamper_22 scamper_22 is offline
Registered User
 
Join Date: May 2009
Posts: 8
"What would you do if someone came along with another 5 tests? "

you add new tables... that's a given. I don't see the code being that complicated.

On a new test, you just create a new sql table programmatically. One extra step in the creation.

The rest of the code would not change. Searching, showing data... why would they change at all?

The table names for each test would be stored in a main table
testid, CounterTable

Last edited by scamper_22; 05-27-09 at 18:23.
Reply With Quote
  #14 (permalink)  
Old 05-27-09, 19:39
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
Quote:
Originally Posted by scamper_22
"What would you do if someone came along with another 5 tests? "

you add new tables... that's a given.
but that's a poor design

consider the query for finding the average of several tests -- the more tests, the clumsier this query
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #15 (permalink)  
Old 05-27-09, 19:51
mike_bike_kite mike_bike_kite is offline
vaguely human
 
Join Date: Jun 2007
Location: London
Posts: 2,519
Quote:
Originally Posted by scamper_22
"What would you do if someone came along with another 5 tests? "

you add new tables... that's a given. I don't see the code being that complicated.

On a new test, you just create a new sql table programmatically. One extra step in the creation.

The rest of the code would not change. Searching, showing data... why would they change at all?

The table names for each test would be stored in a main table
testid, CounterTable
So your table lookup table would be a little like an manually coded index ie you lookup the key value, find the subset of data then read the data. Not bad apart from the fact you'd need to :
  • code all this yourself
  • it would be slower than the inbuilt indexes
  • you'd have to maintain this new table
  • you'd end up with more complex code throughout your application (for instance try grabbing all the test results for today across 25 tables)
and I suppose after doing all this you'd get almost the same functionality as using normal indexes which are available to you for no effort at all!

Out of curiosity would you use indexes in your proposed separate test tables or would you split these tables again into perhaps sub-tables for each hour/minute/second ?

Mike
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On