On 9/18/19 9:40 PM, Ron wrote:
>
> I'm concerned with one pgbackrest process stepping over another one and
> the restore (or the "pg_ctl start" recovery phase) accidentally
> corrupting the production database by writing WAL files to the original
> cluster.

This is not an issue unless you seriously game the system. When a
cluster is promoted it selects a new timeline, and all WAL will be
archived to the repo on that new timeline. It's possible to promote a
cluster without a timeline switch by tricking it, but this is obviously
a bad idea.

So, if you promote the new cluster and forget to disable
archive_command, there will be no conflict because the clusters will be
generating WAL on separate timelines. In the case of a future failover a
higher timeline will be selected, so there still won't be a conflict.

Unfortunately, that dead WAL from the rogue cluster will persist in the
repo until a PostgreSQL upgrade, because expire has no context to tell
it when that WAL can be removed. We're not quite sure how to handle
this, but it seems a relatively minor issue, at least as far as
consistency is concerned.

If you do have a split-brain situation where two primaries are archiving
on the same timeline, then first-in wins. WAL from the losing primary
will be rejected.

Regards,
--
-David
david@xxxxxxxxxxxxx
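
To illustrate the timeline point: a PostgreSQL WAL segment file name is
24 hex digits, and the first 8 are the timeline ID, so segments archived
on different timelines never share a name. Below is a minimal Python
sketch of that, plus a toy model of the "first-in wins" behaviour
described above. The names parse_wal_segment and ToyRepo are
hypothetical and for illustration only; this is not pgbackrest code and
does not reflect how pgbackrest is implemented.

```python
import string


def parse_wal_segment(name: str) -> tuple[int, int, int]:
    """Split a 24-hex-digit WAL segment name into (timeline, log, segment)."""
    if len(name) != 24 or any(c not in string.hexdigits for c in name):
        raise ValueError(f"not a WAL segment file name: {name!r}")
    return int(name[0:8], 16), int(name[8:16], 16), int(name[16:24], 16)


class ToyRepo:
    """Toy archive (an assumption, not pgbackrest): first push of a name wins."""

    def __init__(self) -> None:
        self.segments: dict[str, bytes] = {}

    def push(self, name: str, payload: bytes) -> bool:
        """Store the segment; return False if that name is already archived."""
        parse_wal_segment(name)  # validate the name before storing
        if name in self.segments:
            return False  # a later push of the same name is rejected
        self.segments[name] = payload
        return True


repo = ToyRepo()

# Original primary on timeline 1, promoted copy on timeline 2:
# same log/segment position, different file names, so no conflict.
original_ok = repo.push("000000010000000000000003", b"original primary")
promoted_ok = repo.push("000000020000000000000003", b"promoted cluster")

# Split brain: a second primary archiving on timeline 1 loses.
rogue_ok = repo.push("000000010000000000000003", b"rogue primary")
```

Here original_ok and promoted_ok are both True because the file names
differ in the timeline field, while rogue_ok is False because that
segment name was already archived.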