Re: Bad recovery: no pg_xlog/RECOVERYXLOG

Mark,

* Mark Kirkwood (mark.kirkwood@xxxxxxxxxxxxxxx) wrote:
> On 31/10/17 04:47, Stephen Frost wrote:
> >* Marcin Koziej (marcin@xxxxxxxxxx) wrote:
> >>Now it's fixed, but in case anyone needs them I'm attaching all the
> >>scripts to 1) back up and restore WAL files and 2) back up and restore
> >>a base backup from OpenStack SWIFT
> >Interesting, but these scripts seem to be seriously lacking in error
> >checking (what happens if the copy to swift fails..?  or pg_basebackup
> >fails?) and it's unclear how you can be sure that the WAL file has been
> >sync'd to disk which is important or you might end up having holes in
> >your WAL stream if the swift system fails.  There's also no checking to
> >make sure that the WAL needed for a given pg_basebackup ever actually
> >made it to the swift system, which is required to ensure you have a
> >consistent backup.
> >
> >Generally speaking, these kinds of scripts really aren't a good choice
> >for doing backups of PG.  I'd strongly suggest you look at one of the
> >existing tools which are developed specifically for doing backups of PG
> >and are well tested, supported, and maintained.  If you'd like support
> >for a new storage system, I know that at least pgBackRest's storage
> >layer is pluggable and adding a new storage option is pretty
> >straightforward.
>
> I'm not convinced that his approach is bad.

I was the same way for a long time, thinking that shell scripts could
reasonably be used with certain caveats, but the devil really is in the
details and it's far too easy to miss things in shell scripts (such as
not checking return codes, or not doing so properly, or various other
issues).  Also, you didn't address things like verifying that you
actually have all the WAL needed for a valid backup, or how to handle
retention.
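
To make the "all the WAL needed" point concrete, here is a rough sketch of
the kind of check I mean.  It is not taken from any existing tool, and it
makes assumptions: the default 16MB WAL segment size, no timeline switch
during the backup, and that you already have the backup's first and last
segment names (for example from the .backup history file) plus a listing of
what actually reached the archive.  The function names are made up for
illustration.

```python
# Sketch only: assumes 16MB WAL segments and a single timeline.
SEGMENTS_PER_LOG = 0x100000000 // (16 * 1024 * 1024)  # 256 segments per "log"


def parse_wal_name(name):
    """Split a 24-hex-digit WAL segment name into (timeline, segment number)."""
    if len(name) != 24:
        raise ValueError("not a WAL segment name: %r" % name)
    timeline = int(name[0:8], 16)
    log = int(name[8:16], 16)
    seg = int(name[16:24], 16)
    return timeline, log * SEGMENTS_PER_LOG + seg


def wal_name(timeline, segno):
    """Build the segment file name for a timeline and segment number."""
    return "%08X%08X%08X" % (
        timeline,
        segno // SEGMENTS_PER_LOG,
        segno % SEGMENTS_PER_LOG,
    )


def missing_wal(start_name, stop_name, archived_names):
    """Return the segments between start and stop (inclusive) not in the archive."""
    start_tli, start_seg = parse_wal_name(start_name)
    stop_tli, stop_seg = parse_wal_name(stop_name)
    if start_tli != stop_tli:
        raise ValueError("timeline switch is not handled in this sketch")
    if stop_seg < start_seg:
        raise ValueError("stop segment precedes start segment")
    archived = set(archived_names)
    needed = [wal_name(start_tli, s) for s in range(start_seg, stop_seg + 1)]
    return [name for name in needed if name not in archived]
```

A backup should only be considered valid if missing_wal() returns an empty
list for it.  Retention then has to be careful never to expire a segment
that a backup you are keeping still needs.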

> The script checks the result of the 'swift upload' for the base
> backup, it is the wal backup one that does not explicitly check the
> 'swift upload' result (this should really be added). To be fair,
> anything wrong with the swift system will likely be discovered
> immediately beforehand where he does a 'swift stat'!

Things could certainly break between those two calls to swift, in a
variety of ways.
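
As an illustration only (this is not what the attached scripts do), the
upload's own exit status is the thing to check, not the result of an
earlier 'swift stat'.  The sketch below assumes the upload command is
passed in as an argument list and that the wrapper's exit status is handed
back to PostgreSQL as the archive_command result, so that a failure makes
PostgreSQL keep the segment and retry later.  The function names and the
timeout are hypothetical.

```python
import subprocess
import sys


def run_checked(cmd, timeout=300):
    """Run cmd (a list of arguments); return True only if it exited with 0."""
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # Command missing, not executable, or hung past the timeout.
        print("upload command could not complete: %s" % exc, file=sys.stderr)
        return False
    if result.returncode != 0:
        print(
            "upload command failed with status %d" % result.returncode,
            file=sys.stderr,
        )
        return False
    return True


def archive_exit_status(upload_cmd):
    """Map the upload result to the exit status reported to PostgreSQL."""
    return 0 if run_checked(upload_cmd) else 1
```

Even with this in place, a zero exit status only tells you what the client
reported.  It does not tell you that the file has been durably stored on
the other side.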

> I'd guess his original problem was an improperly set up
> recovery.conf, rather than the overall design.

I agree that the original issue is unlikely to be related to these
scripts.  That doesn't mean that using them is a good idea.

Thanks!

Stephen


