Re: postgres hot-standby questions.

John Scalia <jayknowsunix@xxxxxxxxx> · Thu, 26 Mar 2015 11:43:59 -0400



On 3/26/2015 10:17 AM, Graeme B. Bell wrote:
> Hello everyone, 
>
> Two questions, grateful for any feedback anyone can share. They relate to switchover between master and hot-standby.
>
>
> 1. What exactly is the behaviour of the master/primary during shutdown when it has a hot standby?
>
> https://wiki.postgresql.org/wiki/Hot_Standby
> http://www.postgresql.org/docs/current/static/hot-standby.html
>
> If I shut down the master, then afterwards when it is finished, I shut down the standby, will they contain identical logical databases & WAL records, e.g. assuming possible network failure?
>
> i.e.
> - Is the primary shutdown delayed until it has received notice from the standby that all WAL has been received?    ( my guess is: no)
> - Is the primary shutdown delayed until it has received notice from the standby that all WAL has been applied?    ( my guess is: no)
> - Can the primary generate any new WAL (e.g. checkpoint) or logical data changes during the shutdown process that might not be sent to the standby?   (my guess is: no)
>
> For example, https://vibhorkumar.wordpress.com/2014/06/30/switchoverswitchback-in-postgresql-9-3/
> "With this patch, the walsender process tries to send all outstanding WAL records to the standby in replication when the user shuts down the master."
> "tries"?
>
> That page also makes it seem like you have to manually check the WAL status. (section 2)
> Is there any way to make the primary's completion of shutdown automatically synchronous with completion of WAL on standby(s)?
In synchronous streaming replication, the master will wait for all pending transactions to complete before stopping. It will not accept new connections while a shutdown is pending.
>
> 2. Let's assume for the moment I have some crazy reason to prefer to avoid rsync where possible, such as its historical weird behaviour on HFS+ filesystems or its present unreliable heuristic for syncing hard-links or its default behaviour of not using checksums. Can a controlled switchover (not failover) be built on wal_keep_segments alone? 
>
> wal_keep_segments could be set to a fairly high number, maybe 10000 (160GB) to allow standby catchup even after a day. Let's assume here than 99% of maintenance takes less than an hour, and that we're keeping rsync as a fallback for the worst case.
>
> e.g. here's the whole switchover process...  
>
> shutdown A (Master)
> shutdown B (Standby)
> (A and B  should be identical in terms of WAL and logical data at this point).
> swap M/S configurations around
> start B (Master)
>
> do some work on A for an hour
> finish work on A
>
> start A (Standby)
> A catches up with B from wal_keep_segments.
>
>
> shutdown B (Master)
> shutdown A (Standby)
> (A and B  should be identical in terms of WAL and logical data at this point).
> swap M/S configurations around
> start A (Master)
> start B (Standby)
If you want then to promote the standby to master, you need as others have noted, a trigger file placed on the standby. At that point, your new primary will absolutely be up to
date with all transactions. That's the main reason I really like synchronous replication over asynchronous. The problems only arise when your standby fails as the current primary
cannot commit a transaction and is thus somewhat unresponsive. A second hot standby will solve that problem providing both hot standbys do not go offline simultaneously.
>


-- 
Sent via pgsql-admin mailing list (pgsql-admin@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin