If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Database Server Software > DB2 > Cache replacement policy considered harmful (for union statements)

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 06-02-06, 14:41
pjcarreira pjcarreira is offline
Registered User
 
Join Date: Feb 2004
Location: Lisbon, Portugal
Posts: 5
Cache replacement policy considered harmful (for union statements)

Folks, it looks like DB2 (UDB 8.2.2/Linux) cache replacement policy is not "good" when multiple performing full table scans are involved.

Here's what I'm trying to do: I'm transforming columns into rows of a source table loanevt(loanno char(7),evtyp char(6),captl decimal(9,2),tax decimal(5,2),expns decimal(5,2),bonus decimal(5,2)) into a target table named events using the following query:

insert into events (loanno, evtyp, amtyp, amt)
select loanno, evtyp, 'CAPTL' as AMTYP, captl
from loanevt
where captl > 0
union all
select loanno, evtyp, 'TAX' as AMTYP, tax
from loanevt
where tax > 0
union all
select loanno, evtyp, 'EXPNS' as AMTYP, expns
from loanevt
where expns > 0
union all
select loanno, evtyp, 'BONUS' as AMTYP, bonus
from loanevt
where bonus > 0

The issue here is that this statement scans the loanevt table multiple times. However, I was expecting that as I increase the size of the buffer pool I would observe gradually lesser Phys. Reads. But this does not happen. Instead, the number of physical reads only drops (abruptly) when the input table can be totally fitted in the buffer pool.

I'm testing this with 5M records, so loanevt ends up with 26640 blocks (8K) read through TBSCAN.

With a buffer pool of 25600 blocks, I get this:
Buffer pool data logical reads = 110490
Buffer pool data physical reads = 104795
...
Total execution time (sec.ms) = 246.635994

With a buffer pool of 27000 blocks, I get this:
Buffer pool data logical reads = 110490
Buffer pool data physical reads = 26640
...
Total execution time (sec.ms) = 80.343795

It looks to me that DB2 is reading the scanned pages and marking them as MRU, this appears ok, I think this is how it should work. When it starts scanning the relation again to build up the result of the union, it will need the first block. Now, even if the buffer is one block smaller than the input relation, when the last block of loanevt is read-in, by age, block #1 is select as a victim. When block #1 of loanevt is read-in, then block #2 is the victim, when block #2 is read-in block #3 is the victim and on it goes.

I'm still making sense?

Anyway I could change this? I cannot afford a 1GB readpool to treat a real-world version of loanevt.
Any pointers would be greatly apreciated.

Thanks.
Reply With Quote
  #2 (permalink)  
Old 06-02-06, 14:56
Marcus_A Marcus_A is offline
Registered User
 
Join Date: May 2003
Location: USA
Posts: 5,196
It looks to me like you are conducting some sort of test of the bufferpool LRU algorithm and not really trying to address a real-world problem with a SQL statement (maybe the "PhD" is a hint). If the query was a real-world problem, you could eliminate the UNION ALL and put all the predicates in the same query and only scan the table one time. UNION ALL is valuable when each individual query can be optimized to a different index, but since each of your UNION ALL queries do table scans, then it seems like a poor way to do it.

The way you described the LRU algorithm, while maybe not optimal in the case of your specific query, it does seem to be working as designed. In most systems there are multiple users at the same time and referencing different tables, so your scenario seems a bit artificial to me.

In general, bufferpools are not optimized for queries which do complete table scans on large tables. That is why large tables should usually be put in a separate bufferpool, so that the indexes and small tables can stay resident in their own bufferpool without getting flushed out by a table scan on a large table.
__________________
M. A. Feldman
IBM Certified DBA on DB2 for Linux, UNIX, and Windows
IBM Certified DBA on DB2 for z/OS and OS/390
Reply With Quote
  #3 (permalink)  
Old 06-02-06, 18:20
pjcarreira pjcarreira is offline
Registered User
 
Join Date: Feb 2004
Location: Lisbon, Portugal
Posts: 5
Marcus, the query I have presented is a simplifyed version of several rw probs. I have ran into. I don't see how I could eliminate the union all as you sugest. Note that, if I pull all the predicates to the same query, then each input record will result in atmost one output record. Herein, our query needs to produce atmost 4 output records for each input record, hence, the union all.

As I see it, more users accessing the different tables would worsen the problem even more if this load competes for the same cache as the union all. But, true, the LRU algo appears to be working as designed and though I would find a most exciting task messing arround with the LRU algo, I'm afraid I'm not that lucky yet. You can check out topic on my home page. -- BTW, perhaps I should consider ajusting my signature :-).

Indeed, large table scans hinder the effectiveness of buffer pool caches. If the tables unioned are different little can be done. Howewer I would expect the plan generator to do better for a simple case like T union T.
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On