10-23-08, 19:40
|
|
Registered User
|
|
Join Date: Jul 2004
Posts: 256
|
|
|
HADR Congestion
|
|
8.2 & 9.5 on AIX
Hey All,
I've been running ASYNC HADR for almost a year now, and I've been getting some slowdowns where the primary DBs report congestion messages.
The only remedy I have at the moment is to turn off HADR for a bit and let it resync later, which is not ideal. It also shows as congested (from a db2pd -hadr) when it's in remote catchup, which strikes me as weird!
So I'm wondering 3 things...
Is the congested message during catchup real? It's getting the logs from TSM, so there is no reliance on the primary.
How can I determine the root cause of the congestion? It looks like memory and CPU are OK, and the network guys tell me the link is not maxed out...
Are there any strategies to tackle this?
I've increased the HADR buffer on the standby, but that just buys time... I've tried NEARSYNC, but that had too much overhead...
|
|

10-24-08, 01:30
|
|
Registered User
|
|
Join Date: May 2003
Location: USA
Posts: 5,196
|
|
Quote:
|
Originally Posted by meehange
It's getting the logs from TSM ...
How can I determine the root cause of the congestion?
|
Hmmm. Maybe the problem is...
__________________
M. A. Feldman
IBM Certified DBA on DB2 for Linux, UNIX, and Windows
IBM Certified DBA on DB2 for z/OS and OS/390
|
|

10-24-08, 02:01
|
|
Registered User
|
|
Join Date: Jul 2004
Posts: 256
|
|
|
|
Heh, no. It happens during the live transfer of the log portions as well as when it's getting them from TSM during catchup.
|
|

10-24-08, 15:34
|
|
Registered User
|
|
Join Date: May 2003
Location: USA
Posts: 5,196
|
|
For high volume transaction systems running HADR, I asked that they set up a private network between the two servers just to handle the HADR traffic.
__________________
M. A. Feldman
IBM Certified DBA on DB2 for Linux, UNIX, and Windows
IBM Certified DBA on DB2 for z/OS and OS/390
|
|

10-27-08, 00:14
|
|
Registered User
|
|
Join Date: Jul 2004
Posts: 256
|
|
Quote:
|
Originally Posted by Marcus_A
For high volume transaction systems running HADR, I asked that they set up a private network between the two servers just to handle the HADR traffic.
|
Well, we were using a 10Mbit wireless connection to link the two sites (HADR for 16 systems of varying size and workload), and during problem determination we switched to a 10Mbit fiber link. We're still seeing a similar amount of congestion. The network guys tell me the link never reaches the full 10Mbit, but I get the feeling that 10Mbit just isn't quite enough bandwidth for what we're doing. All the options are expensive, though, whether it's buying more CPU or RAM at the standby site (it's about 50% of the spec of the primary machine) or upping the bandwidth, which is why I need to be able to figure out what the bottleneck is.
I mean, it'd be pretty bad to spend $50k on that stuff only to find out later it was because I'd poorly configured something.
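For a rough sense of what a shared 10Mbit link can carry, the arithmetic can be sketched as below. This is only a back-of-the-envelope estimate: the function names are made up for illustration, the 80% usable fraction is an assumed allowance for protocol overhead and other traffic, and an even split across the 16 systems is a simplification, since the post says their workloads vary.

```python
def link_capacity_bytes_per_sec(mbit_per_sec: float, usable_fraction: float = 1.0) -> float:
    """Convert a link speed in Mbit/s to bytes per second.

    usable_fraction is an assumed share of the link actually available
    for log shipping (1.0 means the full nominal rate).
    """
    return mbit_per_sec * 1_000_000 / 8 * usable_fraction


def per_database_share(mbit_per_sec: float, databases: int, usable_fraction: float = 0.8) -> float:
    """Average bytes/s each database could ship if the link were split evenly."""
    return link_capacity_bytes_per_sec(mbit_per_sec, usable_fraction) / databases


# Nominal capacity of the 10Mbit link mentioned in the post.
raw = link_capacity_bytes_per_sec(10)
print(f"nominal capacity: {raw / 1e6:.2f} MB/s, {raw * 3600 / 1e9:.1f} GB/hour")

# Even split across 16 HADR pairs, assuming 80% of the link is usable.
share = per_database_share(10, 16)
print(f"average share per database: {share / 1e3:.1f} kB/s")
```

If the combined log generation rate on the primaries exceeds that nominal figure during peak periods, the link can be the bottleneck even when its average utilisation looks well below 10Mbit.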
|
|

10-27-08, 03:03
|
|
Registered User
|
|
Join Date: May 2003
Location: USA
Posts: 5,196
|
|
I would run some tests with and without HADR turned on to see what your application throughput is, and whether it is actually affected by the claimed "congestion". It is possible that there is nothing to worry about. I would also open a problem with IBM for a better explanation as to what the message means.
__________________
M. A. Feldman
IBM Certified DBA on DB2 for Linux, UNIX, and Windows
IBM Certified DBA on DB2 for z/OS and OS/390
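One way to put a number on the with/without comparison suggested above is to record how many transactions complete in a fixed window for each run and compare the rates. The sketch below is only illustrative: the helper names are hypothetical, and the figures in the usage lines are invented placeholders, not measurements from this thread.

```python
def throughput(transactions: int, elapsed_seconds: float) -> float:
    """Transactions per second over a measurement window."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return transactions / elapsed_seconds


def slowdown_percent(baseline_tps: float, hadr_tps: float) -> float:
    """Percentage drop in throughput relative to the run without HADR."""
    if baseline_tps <= 0:
        raise ValueError("baseline_tps must be positive")
    return (baseline_tps - hadr_tps) / baseline_tps * 100


# Placeholder figures for two 5-minute runs of the same workload.
without_hadr = throughput(60_000, 300)  # HADR stopped
with_hadr = throughput(45_000, 300)     # HADR active
print(f"without HADR: {without_hadr:.0f} tps, with HADR: {with_hadr:.0f} tps")
print(f"slowdown: {slowdown_percent(without_hadr, with_hadr):.1f}%")
```

Repeating the runs at the times of day when congestion is reported would show whether the message lines up with a real drop in throughput.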
|
|

10-27-08, 22:52
|
|
Registered User
|
|
Join Date: Jul 2004
Posts: 256
|
|
Quote:
|
Originally Posted by Marcus_A
I would run some tests with and without HADR turned on to see what your application throughput is, and whether it is actually affected by the claimed "congestion". It is possible that there is nothing to worry about. I would also open a problem with IBM for a better explanation as to what the message means.
|
There's definitely an effect on throughput (at least when HADR is peered) as we can see commit hangs on the primary which immediately disappear when HADR is turned off.
I'll try my hand at the PMR, but previous HADR-related PMRs I've raised don't seem to have gotten much expert support. I reckon it's an area of DB2 where there just isn't that much practical experience...
|
|

10-28-08, 02:48
|
|
Registered User
|
|
Join Date: May 2003
Location: USA
Posts: 5,196
|
|
Quote:
|
Originally Posted by meehange
There's definitely an effect on throughput (at least when HADR is peered) as we can see commit hangs on the primary which immediately disappear when HADR is turned off.
I'll try my hand at the PMR, but previous HADR-related PMRs I've raised don't seem to have gotten much expert support. I reckon it's an area of DB2 where there just isn't that much practical experience...
|
I don't agree with your comments about HADR and support, but you have to be persistent. If you just have a question, then that is not what they are really there for, but if there is a problem then they should help you. Anyway, there have been some APARs relating to congestion. But you should also have someone check out your network.
__________________
M. A. Feldman
IBM Certified DBA on DB2 for Linux, UNIX, and Windows
IBM Certified DBA on DB2 for z/OS and OS/390
|
|

10-30-08, 01:42
|
|
Registered User
|
|
Join Date: Jul 2004
Posts: 256
|
|
Quote:
|
Originally Posted by Marcus_A
I don't agree with your comments about HADR and support, but you have to be persistent. If you just have a question, then that is not what they are really there for, but if there is a problem then they should help you. Anyway, there have been some APARs relating to congestion. But you should also have someone check out your network.
|
I should probably point out that I'm in Asia-Pacific, so I think perhaps the support team here isn't as large or as battle-hardened as their US colleagues
I am having the externals checked, like network etc. but I'm trying to make sure I've done all that I can to make sure the problem isn't in my own yard....
|
|

11-13-08, 16:03
|
|
Registered User
|
|
Join Date: Nov 2008
Posts: 1
|
|
You may want to check out the HADR wiki area on developerWorks, and in particular this page: http://www.ibm.com/developerworks/wi...data/HADR_tune
It helps explain "congestion" (there are several potential causes) and gives some suggestions on how to make sure your comms setup is well tuned for HADR use.
Regards,
- Steve P.
--
Steve Pearson, DB2 for Linux, UNIX, and Windows, IBM Software Group
"Portland" Development Team, IBM Beaverton Lab, Beaverton, OR, USA
|
|

11-17-08, 23:06
|
|
Registered User
|
|
Join Date: Jul 2004
Posts: 256
|
|
That's ace Steve... thanks!
Quote:
|
Originally Posted by stevep222
You may want to check out the HADR wiki area on developerWorks, and in particular this page: http://www.ibm.com/developerworks/wi...data/HADR_tune
It helps explain "congestion" (there are several potential causes) and gives some suggestions on how to make sure your comms setup is well tuned for HADR use.
Regards,
- Steve P.
--
Steve Pearson, DB2 for Linux, UNIX, and Windows, IBM Software Group
"Portland" Development Team, IBM Beaverton Lab, Beaverton, OR, USA
|
|
|