Thanks, Marcus. Unfortunately, the db2diag.log files are not very helpful.
Primary log file after the 60-second timeout:
2009-03-19-12.42.57.525017+120 I2619992E513 LEVEL: Error
PID : 4052 TID : 47398647731840 PROC : db2hadrp (BPRP) 0
INSTANCE: db2inst1 NODE : 000 DB : BPRP
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrEduAcceptEvent, probe:20200
MESSAGE : Did not receive anything through HADR connection for the duration of
HADR_TIMEOUT. Closing connection.
DATA #1 : Hexdump, 4 bytes
0x00007FFFD38B887C : 3D00 0000 =...
2009-03-19-12.42.57.525227+120 I2620506E338 LEVEL: Severe
PID : 4052 TID : 47398647731840 PROC : db2hadrp (BPRP) 0
INSTANCE: db2inst1 NODE : 000 DB : BPRP
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrEduAcceptEvent, probe:20200
RETCODE : ZRC=0x00000000=0=PSM_OK "Unknown"
2009-03-19-12.42.57.525308+120 E2620845E354 LEVEL: Event
PID : 4052 TID : 47398647731840 PROC : db2hadrp (BPRP) 0
INSTANCE: db2inst1 NODE : 000 DB : BPRP
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrSetHdrState, probe:10000
CHANGE : HADR state set to P-RemoteCatchupPending (was P-Peer)
The snapshot shows the connection status as Disconnected.
Nothing appears in the standby db2diag.log until the database is deactivated.
Secondary log file after deactivating the database:
2009-03-19-13.13.04.353260+120 I19685252E394 LEVEL: Warning
PID : 19375 TID : 47600804660864 PROC : db2agent (BPRP) 0
INSTANCE: db2insta NODE : 000 DB : BPRP
APPHDL : 0-1632 APPID: *LOCAL.db2insta.090319111304
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrEduStartup, probe:21151
MESSAGE : Info: HADR Startup has begun.
2009-03-19-13.13.04.360058+120 I19685647E413 LEVEL: Error
PID : 2123 TID : 47600804660864 PROC : db2redom (BPRP) 0
INSTANCE: db2insta NODE : 000 DB : BPRP
APPHDL : 0-1653
FUNCTION: DB2 UDB, recovery manager, sqlpshrScanNext, probe:1450
RETCODE : ZRC=0x80100003=-2146435069=SQLP_LINT "Interrupt from application"
DIA8003C The interrupt has been received.
2009-03-19-13.13.04.360729+120 I19686061E413 LEVEL: Error
PID : 2123 TID : 47600804660864 PROC : db2redom (BPRP) 0
INSTANCE: db2insta NODE : 000 DB : BPRP
APPHDL : 0-1653
FUNCTION: DB2 UDB, recovery manager, sqlpPRecReadLog, probe:1275
RETCODE : ZRC=0x80100003=-2146435069=SQLP_LINT "Interrupt from application"
DIA8003C The interrupt has been received.
2009-03-19-13.13.06.272529+120 I19686475E413 LEVEL: Error
PID : 2123 TID : 47600804660864 PROC : db2redom (BPRP) 0
INSTANCE: db2insta NODE : 000 DB : BPRP
APPHDL : 0-1653
FUNCTION: DB2 UDB, recovery manager, sqlpPRecReadLog, probe:1280
RETCODE : ZRC=0x80100003=-2146435069=SQLP_LINT "Interrupt from application"
DIA8003C The interrupt has been received.
Nothing else gets written to the log file until I run db2_kill. In the past I have waited more than 10 minutes for the deactivate command to respond.
I will try setting the synchronization mode to near-sync before our month-end change freeze and see what happens. During this period the HADR heartbeat breaks almost every day on the busy database. The less busy databases are not affected.
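To see how often the heartbeat breaks and at what times, the primary's db2diag.log could be scanned for the timeout message and the HADR state changes shown above. The following is only a rough Python sketch, not an IBM tool: the function name hadr_events and the two regular expressions are my own, and they assume records that look like the excerpts in this post (a timestamp line containing "LEVEL:", followed by the MESSAGE or CHANGE lines).

```python
import re

# Assumption: every record starts with a timestamp line like the ones above,
# e.g. "2009-03-19-12.42.57.525017+120 I2619992E513 LEVEL: Error".
RECORD_START = re.compile(
    r"^(\d{4}-\d{2}-\d{2}-\d{2}\.\d{2}\.\d{2}\.\d{6}[+-]\d+)\s+\S+\s+LEVEL:\s*(\w+)"
)
# Matches "HADR state set to P-RemoteCatchupPending (was P-Peer)".
STATE_CHANGE = re.compile(r"HADR state set to (\S+) \(was ([^\s)]+)\)")
# Text of the timeout message as it appears in the primary log above.
TIMEOUT_TEXT = "Did not receive anything through HADR connection"


def hadr_events(lines):
    """Return (timestamp, level, description) for HADR timeouts and state changes."""
    events = []
    timestamp = level = None
    for line in lines:
        start = RECORD_START.match(line)
        if start:
            # Remember the header of the record we are now inside.
            timestamp, level = start.group(1), start.group(2)
            continue
        if timestamp is None:
            continue  # text before the first record header
        change = STATE_CHANGE.search(line)
        if change:
            new_state, old_state = change.group(1), change.group(2)
            events.append((timestamp, level, f"{old_state} -> {new_state}"))
        elif TIMEOUT_TEXT in line:
            events.append((timestamp, level, "timeout"))
    return events


# Trimmed copy of two records from the primary log in this post.
SAMPLE = """\
2009-03-19-12.42.57.525017+120 I2619992E513 LEVEL: Error
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrEduAcceptEvent, probe:20200
MESSAGE : Did not receive anything through HADR connection for the duration of
HADR_TIMEOUT. Closing connection.
2009-03-19-12.42.57.525308+120 E2620845E354 LEVEL: Event
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrSetHdrState, probe:10000
CHANGE : HADR state set to P-RemoteCatchupPending (was P-Peer)
"""

if __name__ == "__main__":
    for event in hadr_events(SAMPLE.splitlines()):
        print(*event)
```

Run against the full log (for example with open("db2diag.log") as the source of lines), the timestamps could then be compared with the busy periods on the database.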