Restoring normal master-slave roles after replication failure

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello all,

I'm terribly sorry if this question has been asked before in a number of different forms; I wasn't able to find quite the nuance I was looking for in the archives.

We're running PostgreSQL 9.4 with streaming replication + hot_standby + the newfangled physical replication slots. On the master, the wal_archive level is 'hot_standby', and on the slave, it's 'archive'[1].

As I have understood the conventional wisdom from scouring the documentation and list archives, if the master fails and we trigger-promote the slave to master and then, after fixing the master server, we need to restore it to the master role and restore the slave server back down to the hot_standby role, we need to reinitialise replication. That is, we need to do a brand new base backup of the slave back to the master.

My question is:

Isn't there a better way that lends itself to some degree of automation? We have a ~500 GB database and while we can do a base backup, it's neither pretty nor quick. Can we not configure $OLD_SLAVE as a master (with WAL senders, replication role and all), put $OLD_MASTER in recovery mode and have it play out the accumulated WALs in reverse from $NEW_MASTER, then shut down $NEW_MASTER, reconfigure both replicas to their appropriate roles, and bring them back up like before?

If this is possible, does it depend on setting wal_archive to 'hot_standby' level on the slave during the course of normal operation, too? Shouldn't 'archive' be sufficient to restore from? What exactly does the 'hot_standby' WAL level contain that 'archive' doesn't?

If I'm wrong, is there no other way to do this except to re-do a whole base backup back to the master? How do people with very large databases deal with this? Just grin and bear it?

Many thanks in advance, and I apologise again if this has previously been answered in some canonical way.

-- Alex

[1] This is as it should be, right? We want to have backup WALs in case we experience a total physical meltdown of the master and need to do PITR, and/or to archive them to WAL-E from the slave if the slave were promoted to master.

--
Alex Balashov | Principal | Evariste Systems LLC
303 Perimeter Center North, Suite 300
Atlanta, GA 30346
United States

Tel: +1-800-250-5920 (toll-free) / +1-678-954-0671 (direct)
Web: http://www.evaristesys.com/, http://www.csrpswitch.com/


--
Sent via pgsql-admin mailing list (pgsql-admin@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux