I have also verified that the switchover happens as expected after rebooting the active node. The IP switches over from the active node to the passive node immediately after the reboot, but db2 goes into a pending state and takes around 3 to 4 minutes to come online.
Is that delay expected when switching over from the active node to the passive node? If not, what might have gone wrong, and how should I fix it?
I am using a shared filesystem, so my logs are also shared. The db2diag.log file is written by the active node.
I have shared a partition called /DB2 on both machines. It holds the db2inst1 user's home directory, and the installation is also done in this directory.
# ls /DB2/
drwxr-xr-x 9 dasusr1 dasadm1 65536 2010-03-29 21:11 dasusr1
drwxr-xr-x 8 db2fenc1 db2fadm1 65536 2010-03-29 21:07 db2fenc1
drwxr-xr-x 15 db2inst1 db2iadm1 65536 2010-04-05 17:04 db2inst1
drwxr-xr-x 38 root root 65536 2010-03-29 21:10 v9.7
Do you think this can be an issue?
I will try rebooting active node and check for errors.
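To put an exact number on the 3 to 4 minutes, something like the following could be run on the passive node right after the active node is rebooted. It is only a rough sketch: wait_until_up is a made-up helper name, not a DB2 tool, and it simply retries whatever probe command it is given until that command succeeds or the timeout expires.

```shell
#!/bin/sh
# Sketch: measure how long a probe command takes to start succeeding.
# wait_until_up is a hypothetical helper, not part of DB2.
# Usage: wait_until_up TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
wait_until_up() {
    timeout=$1
    interval=$2
    shift 2
    start=$(date +%s)
    while :; do
        # Run the probe; its output is discarded, only the exit status matters.
        if "$@" >/dev/null 2>&1; then
            now=$(date +%s)
            echo "online after $((now - start))s"
            return 0
        fi
        now=$(date +%s)
        # Give up once the timeout has elapsed.
        if [ $((now - start)) -ge "$timeout" ]; then
            echo "still not online after ${timeout}s"
            return 1
        fi
        sleep "$interval"
    done
}
```

As the instance owner, the probe could be a connect attempt, for example: wait_until_up 600 5 db2 connect to SAMPLE (SAMPLE is a placeholder database name, and the 600 second timeout and 5 second interval are arbitrary). Comparing the reported time with the timestamps in db2diag.log should show where the delay is spent.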
I agree with sathyaram that 3-4 minutes is reasonable. The longer recovery time has to do with the loss of the group buffer pools when the disaster event occurs, which causes database objects to go into group buffer pool recover pending status. It takes a while to get the objects out of pending status when the data sharing group is recovered at the DR site.
One solution is to require the standby system to always be fully up to date with the primary system, which can bring down the time it takes to come online.