I searched the internet for that trick, and it looks like we can make the standby close its connection to the master much earlier than it otherwise would; that is good enough for me for now.
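In case it helps someone else reading this thread, this is roughly what the settings look like. It is only a sketch: the host name, port, user and all timeout values are placeholders I picked for illustration, not recommendations, and it assumes a 9.1 standby configured through recovery.conf.

```
# recovery.conf on the standby
# (host, port, user and the keepalive values are placeholders)
standby_mode = 'on'
primary_conninfo = 'host=master.example.com port=5432 user=replicator keepalives=1 keepalives_idle=30 keepalives_interval=10 keepalives_count=3'

# postgresql.conf on the master
# (60s is only an example value)
replication_timeout = 60s
```

With these numbers the standby should give up on a dead connection after roughly 30 + 3 * 10 = 60 seconds of silence, instead of waiting for the operating system's default keepalive timers. The keepalive options only take effect on platforms whose TCP stack supports them.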
But there still seem to be two problem areas that could be improved over time:
- Although both the master (with replication_timeout) and the slave (with the TCP keepalive options in the primary_conninfo parameter) close the connection quickly (based on the TCP idle connection timeout), as of now they do not log this. It would be really helpful if such disconnects were logged with an appropriate severity, so that the problem can be identified early and so that patterns and the history of such issues can be tracked.
- Presently, neither the master nor the standby attempts to resume streaming replication when they can reach each other again after a prolonged disconnect. It would be better if the master, the slave, or both made periodic checks to find out whether the other is reachable and resumed replication (if possible, or else logged a message that a full sync may be required).
Thanks and Regards,
Samba
----------------------------------------------------------------------------------------------------------------------
On Fri, Nov 4, 2011 at 7:25 AM, Fujii Masao <masao.fujii@xxxxxxxxx> wrote:
On Thu, Nov 3, 2011 at 12:25 AM, Samba <saasira@xxxxxxxxx> wrote:
> The postgres manual explains the "replication_timeout" to be used to
>
> "Terminate replication connections that are inactive longer than the
> specified number of milliseconds. This is useful for the primary server to
> detect a standby crash or network outage"
>
> Is there a similar configuration parameter that helps the WAL receiver
> processes to terminate the idle connections on the standby servers?
No. But setting keepalive libpq parameters in primary_conninfo might be useful
to detect the termination of connection from the standby server.
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center