David Steele <david@xxxxxxxxxxxxx> writes:

> On 9/18/19 9:40 PM, Ron wrote:
>
>> I'm concerned with one pgbackrest process stepping over another one and
>> the restore (or the "pg_ctl start" recovery phase) accidentally
>> corrupting the production database by writing WAL files to the original
>> cluster.
>
> This is not an issue unless you seriously game the system.  When a

And/or your recovery system is running archive_mode=always :-)

I don't know how popular that setting is, but combine it with an
archive_command identical to the origin's and you get duplicate
archival, with whatever consequences follow.

Disclaimer: I don't know if pgbackrest guards against such a
configuration.

> cluster is promoted it selects a new timeline and all WAL will be
> archived to the repo on that new timeline.  It's possible to promote a
> cluster without a timeline switch by tricking it but this is obviously a
> bad idea.
>
> So, if you promote the new cluster and forget to disable archive_command
> there will be no conflict because the clusters will be generating WAL on
> separate timelines.
>
> In the case of a future failover a higher timeline will be selected so
> there still won't be a conflict.
>
> Unfortunately, that dead WAL from the rogue cluster will persist in the
> repo until a PostgreSQL upgrade because expire doesn't know when it can
> be removed since it has no context.  We're not quite sure how to handle
> this but it seems a relatively minor issue, at least as far as
> consistency is concerned.
>
> If you do have a split-brain situation where two primaries are archiving
> on the same timeline then first-in wins.  WAL from the losing primary
> will be rejected.
>
> Regards,

-- 
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres.consulting@xxxxxxxxxxx
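
P.S. A footnote on the timeline argument quoted above: a PostgreSQL WAL
segment file name is 24 hex digits, and the first 8 are the timeline ID.
That is why two clusters on different timelines produce differently named
segments. Below is a minimal sketch of that naming rule only; the function
names are hypothetical, the segment names are made-up examples, and it says
nothing about how pgbackrest itself checks or stores WAL.

```python
import string


def wal_timeline(segment_name: str) -> int:
    """Return the timeline ID encoded in a WAL segment file name.

    Assumes the standard 24-hex-digit segment name, whose first
    8 digits are the timeline ID.
    """
    if len(segment_name) != 24 or any(
        c not in string.hexdigits for c in segment_name
    ):
        raise ValueError(f"not a WAL segment name: {segment_name!r}")
    return int(segment_name[:8], 16)


def same_archive_name(a: str, b: str) -> bool:
    """Hypothetical helper: two archived segments can only clash
    when their file names are identical, which requires the same
    timeline as well as the same segment position."""
    return a.upper() == b.upper()


# Made-up example names: same segment position, different timelines.
origin = "000000010000000000000042"    # original primary, timeline 1
promoted = "000000020000000000000042"  # restored and promoted, timeline 2
```

With these example names, wal_timeline() reports different timelines for
origin and promoted, and same_archive_name(origin, promoted) is false. Two
primaries archiving on the same timeline would produce identical names,
which is the split-brain case described in the quoted text.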