Re: synchronous replication: blocking commit on the master

Fujii Masao <masao.fujii@xxxxxxxxx> · Wed, 29 Feb 2012 13:02:55 +0900

On Wed, Feb 29, 2012 at 3:22 AM, Jameison Martin <jameisonb@xxxxxxxxx> wrote:
> i don't think i've explained things very clearly. the implied contradiction
> is that i'd be using asynchronous replication to catch up a slave after a
> slave failure and thus i'm losing the transactional consistency that i
> suggest i need.  if a slave fails and is brought back on line i am indeed
> proposing that it catch up with the master asynchronously; however,  the
> slave wouldn't be promoted to a hot standby until it is completely caught up
> and could be reestablished as a synchronous replica (at least that is what
> i'd like to do in theory). so i'm proposing that a slave would never be a
> candidate for a HA failover unless it is completely in sync with a master:
> if there is no slave that is in sync with the master at the time the master
> fails, then the master would have to be recovered from the filesystem via
> traditional recovery. the fact that i envision 'catching up' a slave to a
> master using asychronous replication is not particularly relevant to the
> transactional guarantees of the system as a whole if the slave is
> effectively unavailable while catching up.
>
> similarly, any slave that isn't caught up to its master would also not be
> eligible for queries.
>
> i can understand why the master might hang when there is no reachable
> replica during synchronous commit, this is exactly the right thing to do if
> you want to guarantee that you have at least 2 distinct spheres of
> durability. but i'd prefer to sacrifice the extra durability guarantee in
> favor of availability in this case given that recovery from the file system
> is still an option should the master subsequently fail. my availability
> issue is that the master would clearly be hung/unavailable for an unbounded
> amount of time without a strong guarantee about the time it might take to
> bring a replica back up which is not acceptable in my case.
>
> if the master hangs commits because there is no active slave, i believe that
> an administrator would have to
>
> detect that there are no active slaves
> shut the master down
> disable synchronous replication
> bring the master back up

You don't need to restart the server when you disable sync replication.
You can do that by emptying synchronous_standby_names in postgresql.conf
and reloading it (i.e., pg_ctl reload).

BTW, though you can disable sync replication by setting synchronous_commit
to local in postgresql.conf, you should use synchronous_standby_names for
that purpose instead. Setting synchronous_commit to local can prevent new
transactions (which are executed after setting synchronous_commit to local)
from being blocked, but cannot resume the already-blocking transactions.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general