Unanswered: Slave won't reconnect after VPN Failure
I am currently facing a problem where my slave node will not reconnect to the master after a VPN disconnect. The VPN provides connectivity from our slave to our master. We have 3 servers running Postgres 9.3.4 on CentOS. The first 2 servers are co-located on the same network in a high-availability master/slave configuration managed by Linux-HA. The 3rd server is our problem child: it resides on a different network and uses a VPN to establish connectivity to the master.
After replication has been successfully established (verified by querying pg_stat_replication), if the VPN gets disconnected and then reconnected, the slave does not reconnect to the master. PostgreSQL on the slave needs to be restarted in order to re-establish connectivity.
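For reference, the pg_stat_replication check mentioned above looks roughly like the sketch below. The query is meant to be run on the master with any client (psql or a driver); the helper name standby_is_streaming and the sample addresses are hypothetical and only illustrate how the result can be interpreted.

```python
# Query to run on the master. These column names match PostgreSQL 9.3;
# newer major versions renamed the *_location columns.
REPLICATION_QUERY = """
SELECT client_addr, state, sent_location, replay_location
FROM pg_stat_replication
"""


def standby_is_streaming(rows, standby_addr):
    """Return True if the master reports a streaming walsender for standby_addr.

    rows: iterable of (client_addr, state, sent_location, replay_location)
          tuples, in the column order of REPLICATION_QUERY.
    """
    for client_addr, state, _sent, _replay in rows:
        if str(client_addr) == standby_addr and state == "streaming":
            return True
    return False


if __name__ == "__main__":
    # Hypothetical sample row standing in for a real query result.
    sample = [("10.0.0.2", "streaming", "0/3000148", "0/3000148")]
    print(standby_is_streaming(sample, "10.0.0.2"))
    print(standby_is_streaming(sample, "10.8.0.3"))
```

This only reflects what the master believes about the connection, so it is best compared with the state seen on the slave itself.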
What I have checked so far:
- I verified that WAL_SENDER_TIMEOUT and WAL_RECEIVER_TIMEOUT have been set to 0 (i.e. disabled) on all three servers.
- I verified that replication recovers successfully after a network stop. For example, I issued a service network stop, waited 10 minutes, issued a service network start, and the slave then successfully reconnected to the master.
- The pg_log file does not show any message when the VPN is stopped, nor anything indicating that the slave cannot connect to the master.
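The timeout check in the first item can be repeated on each server with a query against pg_settings. A minimal sketch, assuming the rows are fetched with whatever client is at hand; the helper name timeouts_disabled is hypothetical, and the sample values are made up for illustration.

```python
# Query to run on each of the three servers. pg_settings reports the
# value currently in effect, as text, in the parameter's base unit.
SETTINGS_QUERY = """
SELECT name, setting
FROM pg_settings
WHERE name IN ('wal_sender_timeout', 'wal_receiver_timeout')
"""


def timeouts_disabled(rows):
    """Map each timeout parameter name to True when its value is 0 (disabled).

    rows: iterable of (name, setting) tuples, in the column order of
          SETTINGS_QUERY.
    """
    return {name: int(setting) == 0 for name, setting in rows}


if __name__ == "__main__":
    # Hypothetical result from one server.
    sample = [("wal_receiver_timeout", "0"), ("wal_sender_timeout", "0")]
    print(timeouts_disabled(sample))
```

Querying pg_settings rather than reading postgresql.conf shows the value the running server is actually using, which matters if the file was edited without a reload.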
Any help or additional troubleshooting steps are appreciated.