If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > General > Database Concepts & Design > multiple M:M subtables or one big table?

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 07-10-06, 00:12
redrisker redrisker is offline
Registered User
 
Join Date: Jul 2006
Posts: 4
multiple M:M subtables or one big table?

I am currently designing a database for a small business to track documents. Their scheme of issuing documents for every job is so complex that I ended up having a number of many-to-many "intermediary subtables" with consists of a mapping of the primary keys of two tables:
e.g.
Code:
receipt_invoice
-----------------
receipt_no
invoice_no
In fact, i have 15 of these tables. As I said, the design was a complex one to accommodate future modifications in the way the physical documents are handled. Therefore, some of these tables may hold only 1 row for the next year and some may not even contain a row up until the time the owner decides to re-design the database.

Owing to the complexity of the database design and to the number of tables to be tracked, I was thinking of doing an optimization:
Code:
table_mapping
---------------
initial_id
secondary_id
mapping_type
where mapping_type shall indentify which kind of primary keys initial_id and secondary_id are. For example, if initial_id is an invoice number and secondary_id is a receipt number, mapping_type would be INV-RCPT. Unique codes shall be used for the other mapping types.

I am seriously thinking of adopting this scheme. Comments will be highly appreciated!
Reply With Quote
  #2 (permalink)  
Old 07-10-06, 05:07
andrewst andrewst is offline
Moderator.
 
Join Date: Sep 2002
Location: UK
Posts: 5,171
Why? Simply so that you hve less tables? But you will have thrown away referential integrity (foreign keys), and your queries will be no simpler (in fact, somewhat more complex). So what is the real benefit of doing this?
__________________
Tony Andrews
http://tinyurl.com/tonyandrews
Reply With Quote
  #3 (permalink)  
Old 07-10-06, 06:34
FAC51 FAC51 is offline
Registered User
 
Join Date: Jun 2004
Posts: 127
Quote:
Originally Posted by andrewst
your queries will be no simpler (in fact, somewhat more complex).
Tony can I ask is this because forming queries that mainly comprise of self joins increase complexities in terms of the query statement size & additonal use of outer joins etc?

i know from my own experience that self joins also hinder performance - because of serialization.
Reply With Quote
  #4 (permalink)  
Old 07-10-06, 07:58
andrewst andrewst is offline
Moderator.
 
Join Date: Sep 2002
Location: UK
Posts: 5,171
That could be a factor perhaps, but it wasn't what I meant. I meant that instead of this:
Code:
select r.x, i.y
from receipts r
join invoice_receipts ir on ir.receipt_id = r.receipt_id
join invoices i on i.invoice_id = ir.invoice_id;
... one now has to write this:
Code:
select r.x, i.y
from receipts r
join table_mapping ir on ir.initial_id = r.receipt_id
join invoices i on i.invoice_id = ir.secondary_id
where ir.mapping_type = 'INV-RCPT';
That's one extra WHERE condition per M:M link. Also, it's now easier to make a mistake: have I used initial_id/secondary_id the right way round?
__________________
Tony Andrews
http://tinyurl.com/tonyandrews
Reply With Quote
  #5 (permalink)  
Old 07-10-06, 08:17
FAC51 FAC51 is offline
Registered User
 
Join Date: Jun 2004
Posts: 127
yes i agree. I also find that the use of outer joins will be a more of an occurance if im querying an all-encompassing single table - again due to filtering, similar to your previous point - although this is specific to my own experiences.
Reply With Quote
  #6 (permalink)  
Old 07-10-06, 12:21
redrisker redrisker is offline
Registered User
 
Join Date: Jul 2006
Posts: 4
thanks andrew! well, yes, aside from the trivial benefit of having less tables, I was thinking of the occurence of some tables being wasted if not used. As I said, some of the tables may hold only 1 row for the next year and some may not even have the benefit of having any row at all.

Another thing is, the application will be heavily using a function to check if a particular M:M relationship exists. The scenario is actually like this: an order may have an invoice, and an invoice may have a receipt. Also, an order may be issued a receipt directly. The order is paid if a receipt for it exists. The M:M tables would be:
Quote:
order_invoice
invoice_receipt
order_receipt
To see if the the order has been paid already, I have to see if an M:M for the particular order exists in the invoice_receipt or in the order_receipt table. If I have used the one-big-table approach, I would only checked one table only; whereas in this one, I first have to check invoice_receipt and if it does not exist, check order_receipt.

This scenario repeats at least 5 times on different combinations for the project.

And also on this note, I would like to ask: in balancing good database design (avoid redundancy, avoid null, etc) versus speed in accessing data (redundant data exists but is taken care of by the application's front-end), where and how do we draw the line?
Reply With Quote
  #7 (permalink)  
Old 07-10-06, 13:12
andrewst andrewst is offline
Moderator.
 
Join Date: Sep 2002
Location: UK
Posts: 5,171
It seems that you are being driven towards a super-generic, everthing M:M everything else model because the requirements have not been clearly analysed. It is difficult to comment with so little background knowledge, but for example if the existence of one receipt associated with an order means that it has been paid then what would be the point of creaing a second receipt for that order? In other words, it seems that each order may be associated with just one receipt.

It still isn't clear to me how having one table simplifies the problem. To see if an order 123 has a receipt the SQL will be something like this with your approach:
Code:
select 1
from   table_mapping or
where  or.initial_id = 123
and    mapping_type = 'ORDER-RCPT'
union
select 1
from   table_mapping oi
join   table_mapping ir on ir.primary_id = oi.secondary_id
where  oi.primary_id = 123
and    oi.mapping_type = 'ORDER-INV'
and    ir.mapping_type = 'INV-RCPT'
... versus:
Code:
select 1
from   order_receipt
where  order_id = 123
union
select 1
from   order_invoice oi
join   invoice_receipt ir on ir.invoice_id = oi.invoice_id
where  oi.order_id = 123
I wouldn't be unduly concerned about creating a few 2-column tables that have no rows or 1 row - they aren't going to break the bank, are they?

Generic database designs are something I haved ranted about before here.

Regarding redundancy vs. speed: in a typical OLTP database (i.e. not a data warehouse) you must avoid redundancy and the associated potential for data corruption during updates. Performance should normally be fine with a normalised, non-redundant database design provided appropriate indexes are created, optimiser stats gathered, etc.
__________________
Tony Andrews
http://tinyurl.com/tonyandrews
Reply With Quote
  #8 (permalink)  
Old 07-10-06, 22:41
redrisker redrisker is offline
Registered User
 
Join Date: Jul 2006
Posts: 4
Actually, there can exist more than one receipt for every invoice or order as partial payment is allowed. But you hit the point: OTLT. I guess I'm just not comfortable with a lot of database tables for a very simple project. But thanks for the help and will now start reading your blog
Reply With Quote
  #9 (permalink)  
Old 07-11-06, 05:15
andrewst andrewst is offline
Moderator.
 
Join Date: Sep 2002
Location: UK
Posts: 5,171
You won't find it hard to keep up - I only add a new entry about twice a year!
__________________
Tony Andrews
http://tinyurl.com/tonyandrews
Reply With Quote
  #10 (permalink)  
Old 07-14-06, 07:10
jfabasques jfabasques is offline
Registered User
 
Join Date: Jul 2006
Posts: 3
database design

to avoid later problems on the use of so many tables, you must fully understand the overall need of the company. This can be avoided with proper gathering of data before design of various entities needed with proper normalization needed with thorough knowledge on how they relate with each other. (entity relationship)

Later problems on design can be avoided if the real problem is fully understood and customer needs is fully met.

If you want to make a new design, please take time to investigate all other possible needs of your clients.
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On