PostgreSQL 12.13 (PGDG packages) in a streaming replication configuration. pgBackRest 2.43 is used for WAL archiving and database backups to cloud storage.
--
I'm testing and documenting a DR exercise process where I:
- Cleanly shut down PG on the primary
- Promote the PG DR replica
- Place the standby.signal file on the old primary and start it up (this presumes no other configuration needs changing; primary_conninfo etc. were already set).
My hope was that I could just start the old primary as the new replica, since it was cleanly shut down before the replica was promoted. However, when I try to start that new replica, I'm met with:
LOG: restored log file "00000002000000B70000005A" from archive
LOG: invalid resource manager ID in primary checkpoint record
PANIC: could not locate a valid checkpoint record
LOG: startup process (PID 17660) was terminated by signal 6: Aborted
LOG: aborting startup due to startup process failure
LOG: database system is shut down
It doesn't appear that any WAL files are missing, as it finds all the files that it asks for. Am I missing a piece here?
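For anyone reading the log above, the restored segment name can be decoded into its parts. This is only a rough Python sketch: it assumes the default 16MB wal_segment_size, and the function and class names are made up for illustration (they are not part of PostgreSQL or pgBackRest). It does not diagnose the checkpoint error; it only shows which timeline and starting LSN the file name refers to.

```python
import string
from typing import NamedTuple


class WalSegment(NamedTuple):
    timeline: int      # timeline ID (first 8 hex digits)
    log_id: int        # high 32 bits of the LSN (middle 8 hex digits)
    seg_id: int        # segment number within that log (last 8 hex digits)
    start_lsn: str     # LSN of the first byte in the segment, "X/X" form


def parse_wal_segment_name(name: str,
                           wal_segment_size: int = 16 * 1024 * 1024) -> WalSegment:
    """Split a 24-character WAL file name into timeline, log and segment.

    Assumes the default 16MB segment size unless told otherwise.
    """
    if len(name) != 24 or any(c not in string.hexdigits for c in name):
        raise ValueError(f"not a WAL segment file name: {name!r}")

    timeline = int(name[0:8], 16)
    log_id = int(name[8:16], 16)
    seg_id = int(name[16:24], 16)

    # Each log_id covers 4GB of WAL, split into fixed-size segments.
    segments_per_log = 0x100000000 // wal_segment_size
    start = (log_id * segments_per_log + seg_id) * wal_segment_size
    start_lsn = "%X/%X" % (start >> 32, start & 0xFFFFFFFF)

    return WalSegment(timeline, log_id, seg_id, start_lsn)


if __name__ == "__main__":
    print(parse_wal_segment_name("00000002000000B70000005A"))
```

For the file in the log, the timeline field comes out as 2. If the cluster was on timeline 1 before the promotion (an assumption; the post doesn't say), that means the old primary is being fed WAL from the post-promotion timeline.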
My hope is to avoid having to do a restore to rebuild the new replica.
An aside for those who may be asking: most of these databases do not have data checksums enabled, so pg_rewind isn't in the picture. Although I'm reading now that we could enable the wal_log_hints parameter as an alternative. I'm leery of the overhead, but if it's the same overhead that data checksums would incur, then I guess nothing would be lost when we eventually enable them.
Don Seiler
www.seiler.us