On 30/11/18 2:06 PM, Stephen Frost wrote:
Greetings,
* Achilleas Mantzios (achill@xxxxxxxxxxxxxxxxxxxxx) wrote:
We've been running our backup solution for the last 5 months to a second
site which has an unreliable network connection. We had problems with
barman: it doesn't support backup resume, and it has no option to disable
the replication slot, in the sense that it is better to sacrifice the
backup than to fill up the primary with WALs and bring the primary down.
Another issue is not supporting backing up entirely from the secondary. With
barman this is not possible; streaming (or archiving) must originate from
the primary. So I want to ask two things here:
- Backing up to a remote site over an unreliable channel is a limited use
case by itself; it is useful for local PITR restores on specific
tables/data, or in case the whole primary suffers a disaster. Is there any
other benefit that would justify building a solution for it?
Please don't build your own solution, it's really quite difficult to get
backups done correctly.
By "building" I meant setting up, nothing fancier :)
- I have read only the best reviews about PgBackRest; can PgBackRest address those issues?
Glad to hear you've read good reviews about pgbackrest. As for
addressing these issues, pgbackrest has:
- Backup resume
- Max WAL lag (in other words, you can have it simply start throwing WAL
away if it can't archive it, rather than allowing the primary to run
out of disk space)
This is just superb! In our case we had the following architecture (now barman is defunct) :
Primary (consistent snapshots with pg_start/stop_backup)+ --> reliable net (archive_command via rsync) --> WAL repository
   |  (async streaming replication)
   |  (reliable net)
   V
Standby --> unreliable net (barman via method rsync + barman streaming from standby ***) --> remote cloud provider site(barman)
So Primary and Standby are in the same cloud provider over a (mostly) consistent network, whereas the barman (remote recovery) site communicates over the internet. We would like to keep the old
functionality (or even add a new PgBackRest node in the main cloud provider), so the question is: is there a way for archive-push to push to two different stanzas? Or to delegate the archive-push to work from
the Standby?
*** newer barman docs (2.5) say this is not supported (wasn't so clear in 2.4)
- Backup using the replica, primarily (note that this, currently,
requires access to the primary, but the bulk of the data comes from
the replica)
- Incremental/differential backup
- Parallel backup/resume and parallel archiving/fetching
- Backup verification: we checksum every file backed up and verify those
checksums on a resume, and we make sure that every WAL file needed to
restore the backup has made it into the archive.
- Delta restore
Which I believe covers most of the use-cases you brought up.
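
To make the "backup verification" item above a bit more concrete, here is a minimal sketch in Python of the general idea: on resume, re-check each already-copied file against a recorded checksum, and confirm that every WAL segment the backup needs is present in the archive. This is not pgbackrest's code or its manifest format. The function names, the dictionary-based manifest, and the use of SHA-1 are all assumptions made for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def sha1_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_to_recopy(backup_dir: Path, manifest: dict[str, str]) -> list[str]:
    """On resume, list files that are missing or fail their checksum.

    `manifest` is a hypothetical mapping of relative file name -> expected
    SHA-1 hex digest recorded during the interrupted backup.
    """
    stale = []
    for rel_name, expected in sorted(manifest.items()):
        target = backup_dir / rel_name
        if not target.is_file() or sha1_of(target) != expected:
            stale.append(rel_name)
    return stale


def missing_wal(required: list[str], archived: set[str]) -> list[str]:
    """Return the required WAL segment names that are not in the archive."""
    return [segment for segment in required if segment not in archived]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        backup_dir = Path(tmp)
        (backup_dir / "base_1").write_bytes(b"relation data")
        manifest = {
            "base_1": sha1_of(backup_dir / "base_1"),
            "base_2": "0" * 40,  # recorded in the manifest but never copied
        }
        # Only the file that did not survive the interruption needs copying.
        print(files_to_recopy(backup_dir, manifest))
```

The point of the design is that a resumed backup only has to transfer what `files_to_recopy` reports, which matters over an unreliable link, while the `missing_wal` check covers the second half of the claim: a backup is only restorable if every WAL segment it depends on reached the archive.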
When we first implemented backup using the replica we had concerns
regarding doing a 'full' replica-based backup, and we didn't really see
there being a lot of demand for such a use-case (the replica has access
to the primary in general if it's a streaming replica, after all...),
but we might be open to revisiting that.
Thank you a lot! We'll definitely consider PgBackRest.
Thanks!
Stephen
--
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt