I have an application that is heavy in both reads and writes. This is running on ASE 12.5 in a RHEL AS 4.0 environment. It's connected to a RAID 10 array. Under normal load the server performs spectacular. However during periods of heavy updates/inserts once a minute the checkpoint process wakes up and flushes everything to disk. The checkpoint takes from 1 to 3 seconds according to sp_sysmon.
The application reading this DB has a 3 second time limit, so any IO's that are outstanding after 3 seconds are tossed.
I've tried every tunable I could find or think of and I'm hoping someone here might have a home grown solution.
I've tried increaseing the "housekeeper free write percent" all the way to 100%, however I never get a free checkpoint and the output of a 20 minute sysmon shows the housekeeper wasn't doing much (I should fire her).
What's really killing me is that other reads are blocked during a checkpoint, Is it supposed to work like this? Are checkpoints supposed to block other transactions, or is this just because checkpoint pegs the IO, starving other transactions?
Is there some way to say limit the checkpoint process to writing maybe 1 MB/s to disk and let other stuff continue to work? As it is checkpoint seems to consume everything it possibly can and nobody else gets to play while he's at the table.
Maybe changing "number of checkpoint tasks" might help
This parameter configures parallel checkpoints:
Parallel checkpoints depend on the layout of the databases and performance of underlying I/O sybsystems. Tune this parameter depending on the number of active databases and the ability of the I/O subsystem to handle writes.
We considered making this change as well, but figured if a single checkpoint process can saturate IO for a couple seconds having multiple checkpoints running probably would decrease that any. Anyone have any ideas if checkpoint is a full blocking process?
Hmm. odd. Is it actually showing blocked in sp_who/sysprocesses? OR did you just use the word "blocked" because of the i/o overhead its taking.
If its actually blocked,
a) any clue on sp_lock?
Also if its not actually blocked,
a) whats the no. of cpus you have
b) whats sp_monitor showing
c) whats the status of other spids
Only thing I can find is the following CR so far. Will check and see if theres anything else.
325003 In rare circumstances, sessions that access a given database may seem to hang while the transaction log of this database is undergoing truncation either by the 'CHECKPOINT SLEEP' task or through the execution of the DUMP TRANSACTION command.