Re: Backups using non-consistent snapshots (e.g. COW reflinks)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Greetings,

* Wilson, Chris (Chris.Wilson@xxxxxxx) wrote:
> Every day we take a basebackup of our main database, which is about 1.5 TB. We store the backup on the same volume, so that it's easily available if we need to do an emergency recovery (e.g. we can flip the directories and do a point-in-time restore), and we also tar, compress and encrypt it for archival (and then restore that backup to a testing environment to make sure that our backups actually work).

pg_basebackup is a good tool for doing one-off backups of smaller
systems but for larger environments you should seriously consider using
tools which are specifically developed and designed for PG backups, such
as pgbackrest.

Glad to hear that you test your backup restore process.  Not sure what
all is involved in that based on your above comment but I'd strongly
encourage doing things like: perform the restore, bring PG up and have
it replay the WAL to some point in time, then use tools like amcheck to
validate indexes and pg_dump/pg_restore to a newly created system to
ensure all of the heap pages are able to be read and that all
constraints in the DB are still valid.  Bonus- the pg_dump output can
be kept around to provide a quick way to restore an individual table.

> The database volume is on a networked block device (similar to iSCSI) which deduplicates the data. The I/O from taking this basebackup and rewriting the local backup directory is quite heavy on network, storage device CPU and disks, and our IT have asked us to reduce it. While we could do a basebackup in tar format and stream it through compression and encryption, we'd lose the ability to restore it quickly. Snapshots on the block device are an option, but they're controlled by IT, and we don't have direct access to them or the ability to automatically create or mount new snapshots.

pgbackrest supports differential and incremental backups which could be
taken instead of always doing full backups as you're doing with
pg_basebackup.  You should still regularly do full backups though, of
course.

> Another option, since the database volume uses XFS, is taking a reflink copy (COW) onto the same filesystem, which is nearly instantaneous (rm -rf $backup && cp -a -reflink $PGDATA $backup) and avoids actually copying any data. This is however not a consistent snapshot. The manual<https://www.postgresql.org/docs/16/wal-intro.html> says:
> 
> "the physical backup doesn't have to be an instantaneous snapshot of the database state - if it is made over some period of time, then replaying the WAL for that period will fix any internal inconsistencies."

This is true when you're making sure to use the low-level backup system
provided by PG- see link below for documentation on that.

> However it's not clear to me exactly what the requirements for this are - especially how Postgres knows where to start replaying the WAL from. (That may not be the only risk of taking backups this way, but it's the only one that I can think of apart from bugs in the filesystem and cp. I am assuming that all required WAL segments will be available as we keep them in the archive for much longer than the checkpoint interval).
> 
> The manual also says<https://www.postgresql.org/docs/16/wal-internals.html>:
> 
> "After a checkpoint has been made and the WAL flushed, the checkpoint's position is saved in the file pg_control. Therefore, at the start of recovery, the server first reads pg_control and then the checkpoint record; then it performs the REDO operation by scanning forward from the WAL location indicated in the checkpoint record." (emphasis mine)
> 
> So if I save a copy of pg_control first (before the backup starts) and then place it inside the backup directory at the end, that might be enough to ensure that recovery cover the entire period while the backup was running. But what exactly is this "checkpoint record" - is it in the WAL? And if so, what is the location that it indicates? Is the record written at the end of the checkpoint, and it points back to the WAL location at the beginning of the checkpoint?

Do not, do not, do not try to hack up your own process for backup and
restore- read the PG docs on how to do this.  There's far too many ways
to get a copy of the data directory that *looks* like a valid backup and
may even be valid in many cases, except that one time you need to
actually restore you discover that the process wasn't correct and you
ended up without a valid backup.

Even better- use a tool that already exists and does all of this
correctly for you instead of trying to develop your own.

> Or is it enough to run pg_start_backup before taking the copy? Will this prevent any additional checkpoints from being taken (once the function returns) until pg_stop_backup is called? Are there any other risks that I'm missing?

If you're doing a backup of PG, you need to use pg_backup_start/stop (or
pg_start_backup/pg_stop_backup in older versions) and you need to make
sure to store the backup_label (returned from pg_backup_stop) file with
the backup- that's what tells PG where it needs to start WAL replay
from.

You also need to fully read all of the documentation on this here:

https://www.postgresql.org/docs/current/continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP

and possibly even get into the PG source code if you want to develop a
serious backup tool for PG (eg: there is zero point copying/backing up
UNLOGGED tables but we don't discuss things like this in the backup
documentation nor how to figure out which files are associated with
UNLOGGED tables and which aren't...).

> Thanks in advance for your help and advice. I will try to submit updates to the manual with what I learn.

Certainly would appreciate any suggestions you have for improving the
manual.

Thanks,

Stephen

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Postgresql Home]     [Postgresql General]     [Postgresql Performance]     [Postgresql PHP]     [Postgresql Jobs]     [PHP Users]     [PHP Databases]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Databases]     [Yosemite Forum]

  Powered by Linux