Hi All,
We are currently using streaming replication on multiple node pairs. We are seeing some issues, but I am mainly interested in clarification.
When a failover occurs, we touch the trigger file, promoting the previous slave to master. That works perfectly.
To recycle the previous master, we create a recovery.conf (with recovery_target_timeline = 'latest') and *try* to start it up. If PostgreSQL starts, we accept it as the new slave. If it does not, we proceed with a full base backup.
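For illustration, a recovery.conf of the kind described above might look roughly like the sketch below. Only the recovery_target_timeline setting is taken from the description; the host, port, user, and trigger file path are placeholders, not our real values.

```
# recovery.conf on the old master (sketch; connection details are placeholders)
standby_mode = 'on'
# Placeholder connection string pointing at the newly promoted master
primary_conninfo = 'host=new-master.example.com port=5432 user=replicator'
# Follow the timeline switch created by the promotion
recovery_target_timeline = 'latest'
# Placeholder path; touching this file promotes the node on a later failover
trigger_file = '/tmp/postgresql.trigger'
```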
This approach seems to work, but I have found indications that it can lead to database corruption: http://hlinnaka.iki.fi/presentations/NordicPGDay2015-pg_rewind.pdf
I would mainly like to understand whether, and why, this approach is a bad idea.
Thanks,
Fredrik Huitfeldt