Re: BDR and Backup and Recovery

On 18 November 2015 at 23:46, Will McCormick <wmccormick@xxxxxxxxx> wrote:
What viable options exist for Backup & Recovery in a BDR environment? From the reading I have done PITR recovery is not an option with BDR. It's important to preface this that I have almost no exposure to postgres backup and recovery. Is PITR not an option with BDR?

If a user fat fingers something and deletes records from a table without a where clause, what is the correct course of action to recover as much data as possible? What type of backup do I require to restore as much data as possible from before the incident in a BDR environment?

Sorry for such an open ended question. :D I'm continuing to read as I solicit feedback.

Is there a document outlining recovery with BDR?

Not yet, and that's a significant oversight.

There are really two options:

* Periodic dumps with pg_dump; or
* PITR, like with PgBarman

Both have associated challenges.

Importantly, you *cannot* run a physical streaming replica of a node and promote it to replace a failed node. I'll explain why below.


PG_DUMP
---

The simplest option is to take periodic dumps of one of the nodes. If you have to restore, you tear down the whole system, create a new standalone node, restore the dump, strip out the old bdr.bdr_nodes and bdr.bdr_connections data, then bring up a new cluster with the restored node as the first node.

0.10.0 will include a built-in function to strip all BDR state from a node and turn it back into a standalone Pg database, which will make this easier. For now, see https://github.com/2ndQuadrant/bdr/issues/127 for steps to de-BDR-ize a database.
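
As a rough illustration of the "periodic dumps" part only, here is a minimal Python sketch that wraps pg_dump. The host name, database name and backup directory are made-up placeholders, and the helper names are mine, not anything BDR ships. It does not cover stripping the BDR state after a restore; follow the issue linked above for that.

```python
import subprocess
from datetime import datetime, timezone


def build_pg_dump_command(host, dbname, out_dir, now=None):
    """Return the pg_dump argv for one custom-format dump of `dbname` on `host`."""
    now = now or datetime.now(timezone.utc)
    # Timestamped file name so periodic dumps do not overwrite each other.
    out_file = f"{out_dir}/{dbname}-{now:%Y%m%dT%H%M%SZ}.dump"
    return [
        "pg_dump",
        "--host", host,
        "--dbname", dbname,
        "--format", "custom",  # custom format, restorable with pg_restore
        "--file", out_file,
    ]


def take_dump(host, dbname, out_dir):
    """Run pg_dump against one node and return the path of the dump file."""
    cmd = build_pg_dump_command(host, dbname, out_dir)
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return cmd[-1]


if __name__ == "__main__":
    # Hypothetical node and paths; run this from cron or similar.
    print(take_dump("node1.example.com", "appdb", "/var/backups/bdr"))
```

Dumping from a single node is enough only if that node holds all the data; the replication sets section below covers the case where it doesn't.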



PITR
---

Alternately you can use the usual physical WAL archiving and base backup method. 

If you have to restore to recover a node it cannot just rejoin the cluster. Its connections will be rejected because its timeline has changed. This is because the other peers might've replayed data to the old node that has been discarded, and will not be replayed again to the restored node. So it'd create a gap in history and as a result, divergence between nodes.
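
To make the "gap in history" concrete, here's a toy Python model. It is not how BDR tracks anything internally; the positions are plain integers I made up to stand in for replay positions.

```python
def lost_changes(peer_confirmed_pos, restored_pos):
    """Positions the peer believes were delivered but the restored node no longer has."""
    return list(range(restored_pos + 1, peer_confirmed_pos + 1))


# The peer remembers that the old node had received everything up to 120.
peer_confirmed_pos = 120
# The PITR restore brings the node back as of position 100.
restored_pos = 100

gap = lost_changes(peer_confirmed_pos, restored_pos)
# The peer resumes sending at 121, so 101..120 never reach the restored node.
print(f"{len(gap)} changes will never be re-sent: {gap[0]}..{gap[-1]}")
# prints "20 changes will never be re-sent: 101..120"
```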

Instead you must re-clone the node from another still-alive node.

If the whole cluster is lost you can do a PITR restore, strip all BDR state from the restored database, then bring it up as the first node in a new cluster, exactly as if you'd restored a dump.


Why can't you have local physical replicas for node failover?
---

It'd be nice to be able to have local streaming replicas for each node in a distributed setup. That way if you lose a node, instead of having to re-clone it over a possibly slow/expensive WAN link, you can just promote the standby.

This isn't currently possible. The main reason is that PostgreSQL does not replicate replication slot state (and, in 9.5, replication identifier state) to standbys. The standby node that gets promoted has no idea what the replay position of the BDR peer nodes is or what position it had replayed to from its peers. It could replay data twice, or miss data, and the same could happen to its peers. Divergence would result.
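
A toy Python illustration of the "replay data twice" case, assuming a deliberately simplified model: the table data reaches the standby through physical replication, but the record of how far the node had replayed from its peer does not. The stream, positions and balance are invented for the example.

```python
def apply_from(stream, start_pos, balance):
    """Apply every (pos, delta) change with pos > start_pos; return the new balance."""
    for pos, delta in stream:
        if pos > start_pos:
            balance += delta
    return balance


# Changes the peer has sent so far, as (position, delta to an account balance).
peer_stream = [(1, +10), (2, +5), (3, -3)]

# The failed node had applied all three: balance 12, replay progress 3.
# The standby has the balance (12) but not the progress.
balance_on_standby = 12

with_progress = apply_from(peer_stream, 3, balance_on_standby)     # nothing re-applied
without_progress = apply_from(peer_stream, 0, balance_on_standby)  # all re-applied

print(with_progress, without_progress)  # prints "12 24"
```

Guessing too high instead of too low gives the opposite failure: changes get skipped rather than applied twice.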

To fix this we need PostgreSQL to replicate slot and replication identifier state to physical streaming replicas. It'd be usable for PITR too, that way. There's work afoot to make that possible in 9.6, but it's never going to be possible in 9.4-based BDR, so you can't use a streaming replica standby to replace a failed node without a whole-cluster rebuild.

(There are more complexities here, too, regarding async replicas, slots that get advanced past the point the replica has data for, etc. We need a way to delay advancing a peer's slot until we've confirmed the local streaming replica has committed.)
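
The rule we'd need is roughly "only confirm to the peer what the local standby also has". A hypothetical sketch in Python, with invented integer positions standing in for LSNs; nothing like this exists in BDR today:

```python
def confirmable_remote_pos(applied, standby_flushed_lsn):
    """Highest peer position that is safe to confirm back to the peer.

    `applied` lists (remote_pos, local_commit_lsn) pairs in apply order:
    the peer's position of each change and where its commit landed in the
    local WAL. A change is safe once the local standby has flushed that
    part of the local WAL.
    """
    confirmed = 0
    for remote_pos, local_commit_lsn in applied:
        if local_commit_lsn > standby_flushed_lsn:
            break  # standby doesn't have this one yet; stop here
        confirmed = remote_pos
    return confirmed


applied = [(10, 1000), (11, 1040), (12, 1100)]

# Standby has flushed local WAL up to 1050, so only peer positions
# up to 11 may be confirmed, even though 12 is applied locally.
print(confirmable_remote_pos(applied, 1050))  # prints "11"
```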

See https://github.com/2ndQuadrant/bdr/issues/98



REPLICATION SETS
----

If you have replication sets where no single node has complete information, it's harder.

Neither pg_dump nor WAL archiving for PITR can capture non-replicated tables that aren't on the local node. So if you have a complicated arrangement of replication sets you have to do some hoop-jumping with pg_dump and scripting to make sure you get a complete set of dumps of all your tables in different replication sets on different nodes. Recovery in this case consists of creating a new cluster, restoring the dump from one node to it, then configuring the replication sets on each other node and restoring the separately-dumped node-local tables to those node(s). The gentler DDL lock in 0.10.0 should make it possible to quiesce writes and force global consistency for long enough to acquire a snapshot on each node so you can get consistent dumps even with replication sets, but there's still a fair bit of manual work involved.
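
The scripting part boils down to working out which tables to dump from which node. A minimal Python sketch, assuming you already know which tables each node holds; the node and table names are invented, and you'd still have to turn the plan into actual pg_dump runs (e.g. one --table option per table):

```python
def plan_dumps(tables_by_node, primary):
    """Decide which tables to dump from which node so every table is dumped once.

    The primary node is dumped in full; every other node contributes only
    the tables that no earlier node in the plan already covers.
    """
    plan = {primary: sorted(tables_by_node[primary])}
    covered = set(tables_by_node[primary])
    for node in sorted(tables_by_node):
        if node == primary:
            continue
        extra = set(tables_by_node[node]) - covered
        if extra:
            plan[node] = sorted(extra)
            covered |= extra
    return plan


tables_by_node = {
    "node_a": {"orders", "customers"},
    "node_b": {"orders", "eu_audit"},
    "node_c": {"customers", "eu_audit", "us_audit"},
}

for node, tables in plan_dumps(tables_by_node, "node_a").items():
    print(node, tables)
```

On restore, the primary's dump goes into the first node of the new cluster and each remaining entry is restored to the node that should own those tables.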

If you're using PITR the concept is similar. You PITR-restore one node, strip all BDR configuration from it to make it back into a standalone DB, then use it to set up a new BDR cluster with all new nodes. On the other nodes you have to restore them temporarily, dump the tables that aren't in the first node's replication sets, and restore them.

I'd really like to bring together a more complete picture here, but the development time currently available has to focus on robustness work and on progress toward 9.6. As always, contribution would be greatly valued, whether in terms of docs or code.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
