I apologize if this has been answered elsewhere. It’s probably an age-old question but I couldn’t find any information on it after doing numerous searches on this forum.
What is the best practice in determining a table-to-tablespace ratio? I’ve heard arguments promoting a one-table-per-tablespace configuration because of Db2’s locking mechanisms, backup/recovery, and general performance alternatives (i.e. bufferpool assignments). Is this true?
On the mainframe (OS/390) it's a lot more important to have one table per tablespace.
Things are a little more complicated on UNIX since you can have SMS or DMS tablespaces, and Raw drives can also be used. Generally, it's not a problem to have multiple tables per tablespace, especially if the tables are not large, but there may be different perspectives on this issue.
But it is generally better to have multiple containers per tablespace, with each container placed on a different physical drive.
I have a very small db in mind and have been using DBS for the most part with multiples tables per tablespace. I was just wondering if there were any obvious advantages in having one table per tablespace that would warrant changing my configuration. Most of the tables are under 100M so I'm not sure if I'd gain anything.
There are advantages and disadvantages either way. If you are autoloading tables then it is best to leave them in separate tablespaces so that the availability of one is not impacted by the other. Also, it gives you the ability to tune the tablespaces to each table if necessary. On the flip side, having tables grouped in tablespaces reduces file handling overhead when reading data into or out of bufferpools - this may get to be significant if a high level of concurrency is needed and for an extremely large system like a DW, it can sometimes be an issue.
Backups and restores are done at the tablespace level, not at the table level. So if a table is corrupted or there's a data issue with a table, and you decide to restore it, then all the tables which reside in that tablespace are restored to the point in time for the restore - you could decide on the groups logically so that this is not a big issue. But you really must weigh it into your decision.
Bufferpools are assigned at the tablespace level too so if you're looking for buferpool separation between tables, you cannot implement that if they reside in the same tablespace.
I do agree that having multiple containers for each tablespace on each node(MPP) is really advantageous - keep in mind that if the logical volumes presented to the database are striped then this introduces striping over striping and may hamper performance if the disk is not laid out correctly to avoid such contention.
Good point about using multiple containers on multiple disks if the disks are already striped (such as using Raid-5).
Stripping via multiple containers can have advantages by enabling DB2 to use parallelism even in a single node (non-MPP) environment, both for utilities and data intensive queries.
If most of the tables are less than 100MB, and a decent amount of memory is allocated to the buffer pools, then the placement is not quite as critical as might otherwise be the case. Very large tables might suggest there own tablespaces for performance and administrative purposes.