| |
|
If this is your first visit, be sure to check out the FAQ by clicking the link above.
You may have to register before you can post: click the register link above to proceed.
To start viewing messages, select the forum that you want to visit from the selection below.
|
 |

08-16-11, 07:39
|
|
Registered User
|
|
Join Date: Jan 2009
Location: Zoetermeer, Holland
Posts: 555
|
|
|
that's how they do it
|
|
Always educational reading TPC-C - Result Details
See how the goeroes at IBM configure their database trying to win such a contest.
( always wanted to use this word in english, glad I do not have to pronounce it  )
PECULIAR:
- IBM proudly introduced table-partitioning in V9. They still use the good-old union all view technique themselves, I wonder why
- They have a 1 to 5 ratio (1 bufferpool serves 5 tablespaces). I wonder why
- STMM.... no STMM. I wonder why, is Scott Hayes right afterall or is it the large number of bufferpools?
Are their more things that caught your attention?
|
|

08-16-11, 07:56
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
|
|
Quote:
Originally Posted by dr_te_z
( always wanted to use this word in english, glad I do not have to pronounce it  )
|
which word? goeroes?
it's pronounced exactly as spelled 
|
|

08-16-11, 08:02
|
|
Registered User
|
|
Join Date: Jan 2009
Location: Zoetermeer, Holland
Posts: 555
|
|
|
|
Quote:
Originally Posted by r937
which word? goeroes?
it's pronounced exactly as spelled 
|
Peculiar. That's the word.
|
|

08-16-11, 08:12
|
|
Registered User
|
|
Join Date: Jan 2007
Location: Jena, Germany
Posts: 2,662
|
|
Quote:
Originally Posted by dr_te_z
- STMM.... no STMM. I wonder why, is Scott Hayes right afterall or is it the large number of bufferpools?
|
If you know exactly the workload and have tuned your system to fit that workload, why would you want to use STMM? It adds overhead (even if very small), which you don't want to have in such a situation.
p.s: I don't know the rationale for the other decisions, so I can't comment.
__________________
Knut Stolze
IBM DB2 Analytics Accelerator
IBM Germany Research & Development
|
|

08-16-11, 13:15
|
|
Registered User
|
|
Join Date: May 2003
Location: USA
Posts: 5,196
|
|
Quote:
Originally Posted by dr_te_z
Always educational reading TPC-C - Result Details
See how the goeroes at IBM configure their database trying to win such a contest.
( always wanted to use this word in english, glad I do not have to pronounce it  )
PECULIAR:
- IBM proudly introduced table-partitioning in V9. They still use the good-old union all view technique themselves, I wonder why
- They have a 1 to 5 ratio (1 bufferpool serves 5 tablespaces). I wonder why
- STMM.... no STMM. I wonder why, is Scott Hayes right afterall or is it the large number of bufferpools?
Are their more things that caught your attention?
|
Most of the tables (8 of the 10 tables that make up the TPC-C schema) are horizontally spread accross 100 different tables by IBM based on the PK range, and organized by Key Sequence for maximum performance. This means there are 100 different Customer tables, etc (Cusotmer0, Customer1, Customer2, ... Customer99). Even though there is a UNION ALL view on them, I believe the application logic may have some intellegence built into to determine which table has the data based on the key (but not 100% sure about that). The design used by IBM in this TPC_C benchmark assumes that one knows in advance all the key values in advance (or almost all of them), and they are not signficantly increased over time since all the 100 customer tables (and the 7 others that split into 100 tables) are created and populated in advance with all the key values and ranges (a flaw in the TPC-C benchmark IMO). Using a design like this is probably not practical for most of our applications.
Table partitioning has adminstrative advantages such as being able to detach old partitions, but this data is partitioned by key range (not date) and there is no clean-up of old data required as part of TPC-C.
I can't believe that anyone thought that STMM is actually more efficient than manually tuning by someone who knows what they are doing. STMM is just marketing hype to deal with the criticism that DB2 requires significant DBA resources that are hard to find.
The other interesting thing about this benchmark compared to others previously done by IBM is use of massive number of SSD storage devices for data.
This particular benchmark does not set any total throughput records, but does set a record in cost/transcation when total tpmC is at least 2 million.
__________________
M. A. Feldman
IBM Certified DBA on DB2 for Linux, UNIX, and Windows
IBM Certified DBA on DB2 for z/OS and OS/390
|
|

08-17-11, 02:40
|
|
Registered User
|
|
Join Date: Jan 2009
Location: Zoetermeer, Holland
Posts: 555
|
|
Quote:
Originally Posted by Marcus_A
I believe the application logic may have some intellegence built into to determine which table has the data based on the key (but not 100% sure about that).
|
Right, some kind of "do it yourself" partition elimination.
A normal application would never be coded like that so we (simple mortals) should go on using data-partitioning.
How about the huge number of buffer-pools? That does not strike anybody as odd? Or do you do that all the time: every tablespace it's own dedicated bufferpool?
|
|

08-17-11, 03:25
|
|
Registered User
|
|
Join Date: May 2003
Location: USA
Posts: 5,196
|
|
Quote:
Originally Posted by dr_te_z
Right, some kind of "do it yourself" partition elimination.
A normal application would never be coded like that so we (simple mortals) should go on using data-partitioning.
How about the huge number of buffer-pools? That does not strike anybody as odd? Or do you do that all the time: every tablespace it's own dedicated bufferpool?
|
The bufferpools are related to the "partitioning" of the tables, so that there are 100 customer tables (each with a tablespace), and also 100 tables for WAREHOUSE, DISTRICT, STOCK, CUSTOMER, HISTORY, ORDERS, ORDERLINE, and NEWORDER which is 8 out the 10 entities in the data model).
The issue is not whether they manually partitioned the data (instead of using Table Partitioning), it is that they partitioned the data by PK key range (which does not include date). In a normal database, one does not know all the key ranges in advance to partition the data this way, but there is a flaw in the TPC-C benchmark that allows IBM (and also Oracle has done this) to partition the data manually by range in advance for 8 of the 10 tables in the TPC-C schema. If there were only 10 physical tables in the schema, instead of 802 tables the way IBM did it (8 x 100 + 2 not partitioned), there wouldn't be so many bufferpools.
__________________
M. A. Feldman
IBM Certified DBA on DB2 for Linux, UNIX, and Windows
IBM Certified DBA on DB2 for z/OS and OS/390
|
|

08-17-11, 04:57
|
|
Registered User
|
|
Join Date: Jan 2007
Location: Jena, Germany
Posts: 2,662
|
|
For the buffer pools, you also want to avoid problems if 1 table/partition is more heavily hit. That could cause pages of other tables to be evicted while a separate buffer pool would keep those pages in memory. For things like that, you'd want to have many (smaller) buffer pools.
Basically, such a decision depends on the internal knowledge of the BP's eviction strategy.
__________________
Knut Stolze
IBM DB2 Analytics Accelerator
IBM Germany Research & Development
|
|

08-17-11, 04:59
|
|
Registered User
|
|
Join Date: Jan 2009
Location: Zoetermeer, Holland
Posts: 555
|
|
Quote:
Originally Posted by Marcus_A
802 tables the way IBM did it (8 x 100 + 2 not partitioned), there wouldn't be so many bufferpools.
|
I understand.
But I wonder why they did not define 1 big bufferpool shared by all tablespaces because they all have the same characteristics. Now they define a new/fresh bufferpool for each 6 to 8 tablespaces. It seems to me that 10 bufferpools of 100 pages each is more efficient than 1 bufferpool with the size of 1000 pages. That is the impression I get, but I've never heard if that is the case or not.
|
|

08-17-11, 05:30
|
|
Registered User
|
|
Join Date: May 2003
Location: USA
Posts: 5,196
|
|
Quote:
Originally Posted by dr_te_z
I understand.
But I wonder why they did not define 1 big bufferpool shared by all tablespaces because they all have the same characteristics. Now they define a new/fresh bufferpool for each 6 to 8 tablespaces. It seems to me that 10 bufferpools of 100 pages each is more efficient than 1 bufferpool with the size of 1000 pages. That is the impression I get, but I've never heard if that is the case or not.
|
10 bufferpools of 100 pages each is NOT more efficient than 1 bufferpool with the size of 1000 pages. However, in this particular benchmark, all the data is created in advance and access is randomly distributed across all the data at extremely high transaction rates, something that is not likely in a real application.
Whether there are performance issues with really large bufferpools, I have heard that is not a problem with DB2, although it is a problem with Oracle. But I can't say for sure whether multiple bufferpools might perform very slightly better when they exceed a really large size. However, keep in mind that even if that were the case, lots of smaller bufferpools tend to have poorer memory utilization than fewer large bufferpools, which would usually offset any issues with addressing large memory pools. However, as stated above, in the artificial TPC-C benchmark, memory utilization of smaller bufferpools is not an issue as it would be with a normal database/application.
__________________
M. A. Feldman
IBM Certified DBA on DB2 for Linux, UNIX, and Windows
IBM Certified DBA on DB2 for z/OS and OS/390
|
|

08-17-11, 08:56
|
|
Registered User
|
|
Join Date: Jan 2009
Location: Zoetermeer, Holland
Posts: 555
|
|
Thank you guys. Bottom line: those benchmark reports are interesting reads, but not make the mistake to treat them as your example on how-to configure your server. Too much ... how shall I call it ... non-agile stuff in there 
|
|
| Thread Tools |
Search this Thread |
|
|
|
| Display Modes |
Linear Mode
|
Posting Rules
|
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts
HTML code is Off
|
|
|
|
|