On Fri, 07 Apr 2023 13:16:59 +0900 (JST)
Tatsuo Ishii <ishii@xxxxxxxxxxxx> wrote:

> >> > But, I heard PgPool is still affected by Split brain syndrome.
> >>
> >> Can you elaborate more? If more than 3 pgpool watchdog nodes (the
> >> number of nodes must be odd) are configured, a split brain can be
> >> avoided.
> >
> > Split brain is a hard situation to avoid. I suppose OP is talking about
> > PostgreSQL split brain situation. I'm not sure how PgPool's watchdog would
> > avoid that.
>
> Ok, "split brain" here means that two or more PostgreSQL primary
> servers exist.
>
> Pgpool-II's watchdog has a feature called "quorum failover" to avoid
> the situation. To make this work, you need to configure 3 or more
> Pgpool-II nodes. Suppose they are w0, w1 and w2. Also suppose there
> are two PostgreSQL servers pg0 (primary) and pg1 (standby). The goal
> is to avoid that both pg0 and pg1 become primary servers.
>
> Pgpool-II periodically monitors PostgreSQL healthiness by checking
> whether it can reach the PostgreSQL servers. Suppose w0 and w1
> detect that pg0 is healthy but pg1 is not, while w2 thinks the opposite,
> i.e. pg0 is unhealthy but pg1 is healthy (this could happen if w0, w1,
> pg0 are in a network A, but w2 and pg1 in different network B. A and B
> cannot reach each other).
>
> In this situation if w2 promotes pg1 because pg0 seems to be down, then
> the system ends up with two primary servers: split brain.
>
> With quorum failover enabled, w0, w1, and w2 communicate with each
> other to vote on who is correct (a watchdog that cannot communicate
> regards the other watchdogs as down). In the case above w0 and w1 are
> the majority and will win. Thus w0 and w1 just detach pg1 and keep on
> using pg0 as the primary. On the other hand, since w2 loses, it gives
> up promoting pg1, and thus the split brain is avoided.
>
> Note that in the configuration above, clients access the cluster via
> VIP. Since the VIP is always controlled by the majority watchdog,
> clients will not access pg1 because it is set to down status by w0
> and w1.
>
> > To avoid split brain, you need to implement a combination of quorum and
> > (self-)fencing.
> >
> > Patroni quorum is in the DCS's hands. Patroni's self-fencing can be achieved
> > with the (hardware) watchdog. You can also implement node fencing through
> > the "pre_promote" script to fence the old primary node before promoting the
> > new one.
> >
> > If you need HA with a high level of anti-split-brain security, you'll not be
> > able to avoid some sort of fencing, no matter what.
> >
> > Good luck.
>
> Well, if you define fencing as STONITH (Shoot The Other Node in the
> Head), Pgpool-II does not have the feature.

And I believe that's part of what Cen was complaining about:

«
  It is basically a daemon glued together with scripts for which you are
  entirely responsible for. Any small mistake in failover scripts and
  cluster enters a broken state.
»

If you want to build something clean, including fencing, you'll have to
handle and develop it yourself in scripts.

> However I am not sure STONITH is always mandatory.

Sure, it really depends on how much risk you can accept and how much
complexity you can afford. Some clusters can live with a 10-minute split
brain, whereas others cannot survive a 5-second one.

> I think that depends on what you want to avoid by using fencing. If the
> purpose is to avoid having two primary servers at the same time,
> Pgpool-II achieves that as described above.

How could you be so sure?

See https://www.alteeve.com/w/The_2-Node_Myth

«
  * Quorum is a tool for when things are working predictably
  * Fencing is a tool for when things go wrong
»

Regards,
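
PS: to make the quorum vote quoted above concrete, here is a minimal
Python sketch of the w0/w1/w2 scenario. It is only an illustration of
majority voting under a network partition, not Pgpool-II's actual
implementation; the function name failover_decision, the "views"
dictionary and the partition lists are all hypothetical names chosen
for this example.

```python
from collections import Counter


def failover_decision(partition, views, total_nodes):
    """Decide which backend a group of watchdogs may detach.

    partition   -- watchdogs that can still talk to each other
    views       -- watchdog name -> backend it believes is down
    total_nodes -- number of watchdogs configured in the cluster

    Returns the backend to detach, or None when the group has no
    quorum (hypothetical model: a group without quorum does nothing).
    """
    majority = total_nodes // 2 + 1  # strict majority, e.g. 2 out of 3
    if len(partition) < majority:
        return None  # minority side: must not promote or detach anything

    # Count the votes of the watchdogs reachable inside this partition.
    tally = Counter(views[w] for w in partition)
    backend, count = tally.most_common(1)[0]
    return backend if count >= majority else None


# Scenario from the quoted mail: w0, w1 (network A) see pg1 as down,
# w2 (network B) sees pg0 as down.
views = {"w0": "pg1", "w1": "pg1", "w2": "pg0"}

side_a = failover_decision(["w0", "w1"], views, total_nodes=3)
side_b = failover_decision(["w2"], views, total_nodes=3)

print(side_a)  # pg1: the majority detaches the standby, pg0 stays primary
print(side_b)  # None: w2 has no quorum and gives up promoting pg1
```

Note what the sketch assumes: every watchdog follows the rule and the
minority side voluntarily stands down. Nothing in it forces w2 or pg1
to stop if a script misbehaves, which is exactly the gap fencing is
meant to close.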