Re: Postgres HA

Tatsuo Ishii <ishii@xxxxxxxxxxxx> · Sat, 06 Jan 2018 11:09:31 +0900 (JST)

Hi,

Yes, I definitely hang out here.

If you have more specific questions about Pgpool-II, you are encouraged
to subscribe to the Pgpool-II mailing list:
https://www.pgpool.net/mailman/listinfo/pgpool-general

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

> What he said, and you also may want to look at pgpool-II. I’ve had fairly good luck with that and Tatsuo (the author) hangs out here occasionally too.
> ―
> Jay
> 
> Sent from my iPad
> 
>> On Jan 5, 2018, at 4:00 PM, Jehan-Guillaume (ioguix) de Rorthais <ioguix@xxxxxxx> wrote:
>> 
>> On Fri, 5 Jan 2018 13:07:10 -0600
>> Azimuddin Mohammed <azimeiu@xxxxxxxxx> wrote:
>> 
>>> Hello,
>>> I am a little confused about how HA works in Postgres. The article I am
>>> reading states the following: "*If the primary server fails and the standby server
>>> becomes the new primary, and then the old primary restarts, you must have a
>>> mechanism for informing the old primary that it is no longer the primary.
>>> This is sometimes known as STONITH (Shoot The Other Node In The Head),
>>> which is necessary to avoid situations where both systems think they are
>>> the primary, which will lead to confusion and ultimately data loss.*
>>> 
>>> *Many failover systems use just two systems, the primary and the standby,
>>> connected by some kind of heartbeat mechanism to continually verify the
>>> connectivity between the two and the viability of the primary. It is also
>>> possible to use a third system (called a witness server) to prevent some
>>> cases of inappropriate failover, but the additional complexity might not be
>>> worthwhile unless it is set up with sufficient care and rigorous testing.*
>>> *PostgreSQL does not provide the system software required to identify a
>>> failure on the primary and notify the standby database server. Many such
>>> tools exist and are well integrated with the operating system facilities
>>> required for successful failover, such as IP address migration."*
>>> 
>>> Can someone explain how the HA failback will take place
>> 
>> Failback requires either rebuilding the old master as a standby (rsync,
>> pg_basebackup, PITR restore, ...) or using pg_rewind to rewind the old master
>> to a point where it can catch up with the new master.
>> 
>> Some tools try to automate failback using pg_rewind (Patroni, repmgr), but I
>> have no experience with them.
>> 
>>> and what open source tools we can use to make sure that the old primary,
>>> once it has failed over to the slave, marks itself as a slave?
>> 
>> There are a lot of open source tools for building HA around PostgreSQL: repmgr,
>> Patroni (based on etcd or ZooKeeper), PAF (based on Pacemaker), etc. You will
>> have to spend a lot of time testing them extensively, understanding them,
>> picking one, and documenting your cluster.
>> 
>> Regards,
>> 
>
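
The two failback paths mentioned in the quoted reply (rewinding the old
master with pg_rewind, or rebuilding it as a standby with pg_basebackup) can
be sketched roughly as below. This is only an illustration, not something
taken from the thread: the function names, paths, host name and user are
hypothetical placeholders, and the sketch only builds the command lines
without running them. It assumes the old master has been shut down cleanly,
and it relies on the documented pg_rewind requirement that the old master
was running with wal_log_hints = on or with data checksums enabled.

```python
from typing import List


def can_use_pg_rewind(wal_log_hints: bool, data_checksums: bool) -> bool:
    """pg_rewind needs wal_log_hints=on or data checksums on the old master."""
    return wal_log_hints or data_checksums


def pg_rewind_command(old_master_pgdata: str, new_master_conninfo: str) -> List[str]:
    """Build a pg_rewind call that resyncs the old master from the new one."""
    return [
        "pg_rewind",
        "--target-pgdata", old_master_pgdata,
        "--source-server", new_master_conninfo,
    ]


def pg_basebackup_command(new_master_host: str, repl_user: str, empty_pgdata: str) -> List[str]:
    """Build a pg_basebackup call that rebuilds the old master from scratch.

    The target directory must be empty. -X stream also streams the WAL
    needed by the backup, and -R writes the standby recovery settings.
    """
    return [
        "pg_basebackup",
        "-h", new_master_host,
        "-U", repl_user,
        "-D", empty_pgdata,
        "-X", "stream",
        "-R",
    ]


def failback_command(
    pgdata: str,
    new_master_host: str,
    user: str,
    wal_log_hints: bool,
    data_checksums: bool,
) -> List[str]:
    """Prefer pg_rewind when its precondition holds, else do a full rebuild."""
    if can_use_pg_rewind(wal_log_hints, data_checksums):
        # Hypothetical connection string; adjust to your own setup.
        conninfo = f"host={new_master_host} user={user} dbname=postgres"
        return pg_rewind_command(pgdata, conninfo)
    return pg_basebackup_command(new_master_host, user, pgdata)


if __name__ == "__main__":
    # Placeholder values, for illustration only.
    cmd = failback_command(
        "/var/lib/pgsql/data", "db2.example.com", "postgres",
        wal_log_hints=False, data_checksums=False,
    )
    print(" ".join(cmd))
```

Either way, the old master still has to be configured as a standby of the
new master before it is started again, and something outside PostgreSQL
(the tools named above, or fencing) must keep it from accepting writes in
the meantime.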