On Sat, Aug 04, 2018 at 07:54:36AM -0700, Christophe Pettus wrote:
> Would having pg_rewind do a checkpoint on the source actually cause
> anything to break, as opposed to a delay while the checkpoint
> completes?

Users relying only on streaming without archives would be impacted, as
potentially two checkpoints could be performed on the promoted standby,
so the past segments needed from the divergence point would no longer
be around.  That's a problem which exists in v11, as only WAL segments
worth one checkpoint are kept around, not for 9.5, 9.6 and 10.

> The current situation can create a corrupted target, which seems far
> worse than just slowing down pg_rewind.

Hm?  pg_rewind requires the target to be stopped properly, meaning that
the divergence point is known to both nodes.  If the source is online
and has not created the first post-recovery checkpoint, then you would
get a no-op with pg_rewind, and when restarting the old master with a
recovery.conf you would get a failure.  If you stop the old master so
that it needs crash recovery at next startup, then there is indeed a
risk of a corrupted instance, but that would be the same problem even
if pg_rewind is used.
--
Michael
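
To illustrate the no-op case described above, here is a minimal sketch of a check that could be run before pg_rewind. It assumes you have the text printed by pg_controldata for both nodes, and that comparing the "Latest checkpoint's TimeLineID" field approximates what pg_rewind looks at. The function names parse_controldata and rewind_precheck are hypothetical and not part of PostgreSQL.

```python
# Hypothetical helper, not part of PostgreSQL: inspects pg_controldata
# output from both nodes before running pg_rewind.

# Cluster states assumed here to mean the target was stopped properly.
CLEAN_STATES = {"shut down", "shut down in recovery"}

STATE = "Database cluster state"
TIMELINE = "Latest checkpoint's TimeLineID"


def parse_controldata(text):
    """Turn pg_controldata output into a dict of label -> value."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only; values may contain colons too.
        label, sep, value = line.partition(":")
        if sep:
            fields[label.strip()] = value.strip()
    return fields


def rewind_precheck(source_text, target_text):
    """Return a short verdict string for the source/target pair."""
    source = parse_controldata(source_text)
    target = parse_controldata(target_text)

    if target[STATE] not in CLEAN_STATES:
        return "target not cleanly stopped"

    # Until the promoted standby completes its first post-recovery
    # checkpoint, its control file still reports the old timeline, so
    # both nodes look identical and a rewind would do nothing.
    if source[TIMELINE] == target[TIMELINE]:
        return "same timeline: checkpoint the source first"

    return "ok"
```

If the verdict is "same timeline: checkpoint the source first", issuing a manual CHECKPOINT on the promoted standby and re-reading its control data should make the timelines differ.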