Re: PostgreSQL Synchronous Replication in production

Merlin Moncure <mmoncure@xxxxxxxxx> · Fri, 7 Jun 2013 10:20:43 -0500

On Fri, Jun 7, 2013 at 3:22 AM, Albe Laurenz <laurenz.albe@xxxxxxxxxx> wrote:
> Colin Sloss wrote:
>> I have been testing the differences between asynchronous and synchronous hot standby streaming
>> replication on PostgreSQL 9.2.4. There is some push towards synchronous replication, but I am finding
>> some serious problems, and wonder how other people deal with them.
>>
>> Action:
>> The Slave is unreachable (postgres is stopped, or machine is turned off)
>>
>> Result:
>> Transactions stay open, and add up until connection is allowed between Master and Slave again.
>>
>> My guess:
>> Create a script that detects when transactions are being opened, but not committed on slave. Alter
>> postgresql.conf to comment out synchronous_standby_names and pg_ctl reload. This should boot the
>> server back to asynchronous, more or less.
>>
>> Caveats:
>> I haven't tested this yet, and I'm not sure what would happen to the transactions that started while
>> synchronous replication was active. Any guesses?
>>
>> The whole idea of my solution was to have no single point of failure. This seems to create two
>> exclusive points of failure, each needing a completely separate reaction. My original proposal was
>> asynchronous replication, with xlogs being stored on shared storage, and DRBD replication to prevent
>> it from being a single point of failure. I have never seen it go over 15kbs behind in my setup, which
>> still results in a very speedy failover. Is it just me, or is that seeming better than just
>> synchronous replication?
>>
>> Another caveat I found is that setting up slaves becomes more complicated. You have to set up the
>> Master in asynchronous style, and then switch it to synchronous only when the timing is right.
>> Otherwise the transactions will sit there until everything is ready.
>>
>> Sorry for the onslaught of questions, I don't expect all of them satisfied. Please share any
>> resolutions to these issues which you guys have discovered.
>
> One simple guideline first:
> If you use synchronous replication, you'll have to have at least two standby
> servers or the overall availability of your system will suffer.
>
> The differences between synchronous and asynchronous replication are mostly:
> a) With synchronous replication you cannot lose a committed transaction
>    during failover.
> b) Synchronous replication will slow down your system; the higher the
>    network latency between the servers, the slower it will get.
>
> You should work on getting the requirements defined:
> - In the case of failover, can you afford to lose a few committed transactions?
> - Are you ready to pay the price for synchronous replication
>   (second standby if you don't want availability to suffer, slower
>   database system, more complicated setup and failover procedures)?
>
> For synchronous replication, keep the servers close together with
> a strong network inbetween.
> To protect against catastrophes (if that's a requirement), you should
> use another asynchronous standby in a distant location.

yeah -- well put.  Synchronous replication exists because in some
cases even the loss of a single transaction is unacceptable.  So it's
not really fair to compare vs asynchronous which is more of a 'best
effort' system; it in no way guarantees that every transaction is
recoverable.  That small fudge allows for big optimizations in terms
of process simplicity and performance.

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general