Quote:
|
Originally Posted by istikhar
Thanks for your reply.
Can you please provide some details on what sort of issues there were while using Tivoli for HADR.
Whether automatic switching was not working?
Any configuration issue?
etc?
Please comment
Thanks
Istikhar
istikhar@gmaila.com
|
One of the biggest issues is that TSA is not idiot-proof, which IMO is a problem for software that is designed to manage an HA cluster.
For example, with TSA you cannot issue normal DB2 commands to start and stop a DB2 instance (db2start, db2stop) or to do an HADR takeover (TAKEOVER HADR ON DATABASE database-alias). You must instead issue special TSA commands (or invoke the HA scripts that come with DB2) for these functions. If you do issue the regular DB2 commands by mistake (which someone may do by instinct in a crisis situation), then your HADR cluster may become hosed if it is under the control of TSA.
For any software of this type, one of the main problems is determining when the cluster manager should issue an HADR takeover by force if the primary server goes down. Unfortunately, TSA does not do this by actually trying to connect to any databases. Instead, it monitors certain DB2 processes and certain resources (like network connections). This is not foolproof IMO. If an instance crashes, TSA will restart the DB2 instance and not do a takeover (if you use the supplied HA scripts).
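To illustrate the distinction, here is a minimal Python sketch of a connection-based check, as opposed to process monitoring. It is not how TSA or the DB2 HA scripts work. The names should_take_over, probe, attempts, and delay are hypothetical, and probe stands in for whatever code would actually try to connect to the database.

```python
import time
from typing import Callable


def should_take_over(probe: Callable[[], bool], attempts: int = 3, delay: float = 1.0) -> bool:
    """Return True only if every connection attempt to the primary fails.

    `probe` is a hypothetical callable that tries to open a real database
    connection and returns True on success. Requiring several consecutive
    failures avoids a forced takeover on a single transient error.
    """
    for attempt in range(attempts):
        try:
            if probe():
                return False  # primary answered, no takeover needed
        except Exception:
            pass  # treat any connection error as a failed attempt
        if attempt < attempts - 1:
            time.sleep(delay)  # wait before retrying
    return True
```

Requiring several failed attempts in a row is a design choice for the sketch: a takeover by force is disruptive, so one dropped connection should probably not trigger it.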
One hint that I will offer is that if you have a public network (for application tier access to the database server) and a private network (for HADR replication between the primary and standby servers) then you need to make the public network a “critical resource.” Otherwise TSA may assume that since it can reach the primary server (from the standby server) via the private network, that everything is OK even though the public network to the primary server is down and the applications cannot reach the database.
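As a toy model of why the critical-resource setting matters, the following Python sketch shows the decision described above. It is only an illustration of the reasoning, not TSA's actual logic, and the function and parameter names are hypothetical.

```python
def primary_looks_healthy(private_ok: bool, public_ok: bool, public_is_critical: bool) -> bool:
    """Toy model of the health decision; not TSA's actual logic."""
    if public_is_critical:
        # Both networks must be up for the primary to count as healthy.
        return private_ok and public_ok
    # Without the critical flag, reachability over the private network is enough.
    return private_ok
```

With the public network down and the private network up, the model reports the primary as healthy unless the public network is marked critical, which is exactly the situation where applications cannot reach the database but no takeover happens.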
The other suggestion is that you use the V9 DB2 HA scripts and V9 clients (type 2 or type 4). The client code was upgraded, and the HA scripts that come with V9 have been improved, even though there were no real changes to HADR itself in the V9 DB2 server code. You could use the V9 HA scripts with a V8 server, and the V9 clients with a V8 server, but these combinations are probably not supported by IBM.
The "automatic switching" is a dual issue: you want an automatic takeover to the standby to occur when the primary is not available, and you want the applications to perform automatic client reroute when the takeover occurs. One of the problems with V8 clients (especially the V8 type 4 clients) is a TCP/IP timeout issue: how long the client should wait on a hung connection (which would occur if the primary server crashed) before it gives up and tries the standby. This is not an issue with a planned takeover (not by force) when both servers are up and running, as would happen when you do a takeover to perform maintenance on the primary server.
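The client side of this can be sketched in plain Python with a bounded timeout. This is only an illustration of the idea of giving up on the primary and trying the standby. It is not the DB2 client's automatic client reroute implementation, and connect_with_reroute and its parameters are hypothetical names.

```python
import socket
from typing import Sequence, Tuple

Address = Tuple[str, int]


def connect_with_reroute(servers: Sequence[Address], timeout: float = 5.0) -> socket.socket:
    """Try each server in order and return the first socket that connects.

    `servers` lists the primary first and the standby second. The timeout
    bounds how long we wait on an unresponsive server before moving on.
    """
    last_error = None
    for host, port in servers:
        try:
            return socket.create_connection((host, port), timeout=timeout)
        except OSError as exc:  # includes timeouts and refused connections
            last_error = exc
    raise ConnectionError(f"no server reachable: {last_error}")
```

The timeout passed to socket.create_connection also stays set on the returned socket, so later reads on a hung connection will time out as well rather than block forever. Choosing the value is the trade-off described above: too long and applications stall after a crash, too short and a briefly slow primary gets abandoned.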
If you never make a mistake when issuing TSA commands, and use the latest version of DB2 clients and server (V9) then I would say TSA may work OK for you.