Unanswered: Working with LOB data - direct reads - High no victim buffers available
I need some help interpreting a buffer pool snapshot.
I have a "new to me" OLTP environ. I been given to tune, it's legacy Websphere Commerce Server. I start with looking at the bufferpools. I turned on the BUFFERPOOL monitor this morning and just took a snapshot. I noticed a high number of no victim buffers available and direct reads. The weird thing to me is the data/async pages writes are low and roughly the same value. Also there are lots of reads happening and few writes. To me it's seems the BP is too small but the page cleaners are okay, and the high no victim buffers are a result of grabbing LOB data. I don't have a lot of experience working with LOB data. Does this snapshot seem alarming? Sorry if I am all over the place in my diagnosis.
Note currently there is only 1 LUN holding all tablespace data. /db2/data/....
Below I pasted a section of the snapshot for Bufferpool 16K:
DB21085I This instance or install (instance name, where applicable:
"db2wcspi") uses "64" bits and DB2 code release "SQL09079" with level
Informational tokens are "DB2 v188.8.131.52", "s140512", "AIP23561", and Fix Pack
Product is installed at "/opt/ibm/db2/V9.7".
snapshot of a 16k bufferpool:
Buffer pool data logical reads = 25803933
Buffer pool data physical reads = 137953
Buffer pool temporary data logical reads = 0
Buffer pool temporary data physical reads = 0
Buffer pool data writes = 3006
Buffer pool index logical reads = 604042384
Buffer pool index physical reads = 43508
Buffer pool temporary index logical reads = 0
Buffer pool temporary index physical reads = 0
Buffer pool xda logical reads = 0
Buffer pool xda physical reads = 0
Buffer pool temporary xda logical reads = 0
Buffer pool temporary xda physical reads = 0
Buffer pool xda writes = 0
Total buffer pool read time (milliseconds) = 3585
Total buffer pool write time (milliseconds)= 11585
Asynchronous pool data page reads = 24929
Asynchronous pool data page writes = 2963
Buffer pool index writes = 2795
Asynchronous pool index page reads = 15619
Asynchronous pool index page writes = 2745
Asynchronous pool xda page reads = 0
Asynchronous pool xda page writes = 0
Total elapsed asynchronous read time = 855
Total elapsed asynchronous write time = 11385
Asynchronous data read requests = 1329
Asynchronous index read requests = 792
Asynchronous xda read requests = 0
No victim buffers available = 181587
Direct reads = 402432
Direct writes = 19782
Direct read requests = 168447
Direct write requests = 9384
Direct reads elapsed time (ms) = 2805
Direct write elapsed time (ms) = 9843
Database files closed = 0
Unread prefetch pages = 3466
Vectored IOs = 3540
Pages from vectored IOs = 60104
Block IOs = 0
Pages from block IOs = 0
Node number = 0
Tablespaces using bufferpool = 2
Alter bufferpool information:
Pages left to remove = 0
Current size = 106727
Post-alter size = 106727
Find out what pct of the LOB / CLOB / BLOB columns are inlined, and the average length of the large column values.
Lobs won't go via the bufferpools (hence direct reads/writes instead), unless the actual value fits in the inline-length.
If none or few are inlined, OR if the average-length of the large columns greatly exceeds the inline-length, then you will experience direct reads/writes when code accesses these lobs.
If that is significant, ensure the tables concerned have a dedicated large tablespace for the lobs distinct from the data tablespace and index-tablespace, and consider enable file-system caching only for that lob-specific tablespace.
Your task is tuning - so you need a structured approach - don't focus on a single snapshot - first define the problems.
Direct reads/writes are commonly associated with lobs that are not inlined, but there are other cases when direct reads are used.
Is STMM enabled? Are the bufferpools resizing automatically? Are bufferpools large enough?
How much RAM is available? What settings for instance_memory and database_memory?
Are other instances/databases sharing the same lpar?
What page cleaning settings are in effect? (chngpgs_thresh, softmax, db2_use_alternate_page_cleaning, num_iocleaners). If *not* using proactive cleaning you can ignore the no victim buffers.
Use different approaches to find bottlenecks.
Consider checking db2top for quick observations.
Consider studing outputs of the monreport procedures (such as monreport.dbsummary, monreport.currentsql etc), when run for an interval at the times when performance is questioned.
Consider checking that Websphere-Commerce best practices for db2 are followed, in relation to performance tuning.
Yes STMM is on and the buffer pools, instance and database memory are all resizing automatically. However I have seen in the past where STMM is not the best at resizing the buffer pools. There is 254GB MEM available. There is only 1 instance on this machine.
They are not using alternative page cleaning and the tablespaces are still REGULAR. I think I can start there in terms of applying the "latest" IBM recommendations as per v8 .
The db2top shows a good hit rate but the victims pages can go up to 9K, whack seemed alarming to me.
I have a mac right now and can't perform my usual perf tuning i.e. visual explain on sql queries. You got it right, I can't really do to much until I see a significant load on the server. They have all the monitor switches turned off so I have asked that the buffer pool monitor be set to ON on the dbm.
The high number of victim buffers indicates there is an performance issue with the cleaning of the bufferpools. As you are on 9.7, you need to tune SOFTMAX and CHNGPGS_THRESH to ensure dirty pages are flushed from the bufferpool quick enough to ease contention.
As for lobs. If the lob is inlined, then the inlined portion of the lob will be read into the bufferpool. If the lobs are not inlined then the lob will be read directly from disk as required.