Hi,

I have a 1 master / 1 slave WAL streaming replication setup. The application connects via a load balancer (LTM), where all connections are directed to the master member (master db). We have archive_mode enabled.

I am trying to test using pg_rewind to restore the new slave (old master) after a failover while the system is under load. Here are the steps I take to test:

1. Disable the master LTM member (all connections are redirected to the slave member)
2. Promote the slave (touch promote.me)
3. Stop the master db (old master)
4. Run pg_rewind on the new slave (old master)
5. Start the new slave

Here are my results: when I tried to start the new slave, I got errors saying that it cannot locate the archived WAL files and cannot receive data from the WAL stream.

Checking on the new master, I see that the checkpoint it is trying to restore is the file 000000040000009C0000006F, but that file does not exist anywhere on the new master: not in pg_xlog and not in the archive folder (as specified in postgresql.conf).

Here is my recovery.conf:

    standby_mode = 'on'
    primary_conninfo = 'host=10.69.19.18 user=replicant'
    trigger_file = '/var/run/promote_me'
    restore_command = 'cp /pg_backup/backup/archive_sync/%f "%p"'

Does anyone know why? Under what conditions will pg_rewind not work?

Thanks,
Dylan
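
For reference, the test steps above could be sketched roughly as the shell functions below. This is only an illustration under assumptions, not the exact commands used: the data directory path, the superuser connection (user=postgres dbname=postgres) passed to pg_rewind, and the helper names run, promote_slave, rewind_old_master and find_wal are all hypothetical. The trigger file, the new master's address, and the archive directory are taken from the recovery.conf shown above. The LTM step is done on the load balancer and is not covered. With DRY_RUN=1 (the default here) the commands are only printed.

```shell
#!/bin/sh
# Sketch of the failover test. Values marked "assumed" are not from the post.

PGDATA="${PGDATA:-/var/lib/pgsql/data}"                       # assumed data directory
NEW_MASTER="${NEW_MASTER:-10.69.19.18}"                       # host from primary_conninfo
ARCHIVE_DIR="${ARCHIVE_DIR:-/pg_backup/backup/archive_sync}"  # from restore_command
TRIGGER_FILE="${TRIGGER_FILE:-/var/run/promote_me}"           # from trigger_file
DRY_RUN="${DRY_RUN:-1}"                                       # 1 = print commands only

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Step 2, run on the slave: creating the trigger file promotes it.
promote_slave() {
    run touch "$TRIGGER_FILE"
}

# Steps 3 to 5, run on the old master.
rewind_old_master() {
    run pg_ctl -D "$PGDATA" stop -m fast
    # Assumed: a superuser connection to the new master for pg_rewind.
    run pg_rewind --target-pgdata="$PGDATA" \
        --source-server="host=$NEW_MASTER user=postgres dbname=postgres"
    # recovery.conf is expected to be in place in $PGDATA before this start.
    run pg_ctl -D "$PGDATA" start
}

# Look for a WAL segment in pg_xlog and in the archive directory on this host.
# Prints the full path and returns 0 if found, returns 1 otherwise.
find_wal() {
    seg="$1"
    for dir in "$PGDATA/pg_xlog" "$ARCHIVE_DIR"; do
        if [ -f "$dir/$seg" ]; then
            echo "$dir/$seg"
            return 0
        fi
    done
    return 1
}
```

After sourcing the file, calling rewind_old_master prints the three commands it would run, and find_wal 000000040000009C0000006F reports whether the segment is present in either location on the host where it is run.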