Greetings, * Tim (timfosho@xxxxxxxxx) wrote: > I'm attempting to create a long term, stand alone backup archive script for > a pretty active and large database (3TB) which will store backups going > back a few months. > Using the combination of pg_basebackup in TAR format, with the `Xn` flag, > and running pg_receivewal in the background, I'm ending up with a backup > that is missing quite a few WAL files. > > Trying to figure out where my understanding of the process is incorrect, > and what I need to do to correct this. > > *Backup *process: > > * Start pg_receivewal to archive location > * Then run: `pg_basebackup -h {host} -p {port} -D - -Ft -Xn > --checkpoint=fast | ...` What's wrong with using -Xs instead..? Basically does exactly what you seem to be trying to do here. > * This backup is taken from a *standby node* Where is the pg_receivewal run? Against the primary or the replica? > * Once pg_basebackup finishes, send SIGINT to pg_receivewal > > *Recovery* Process: > > * Untar basebackup to data directory > * Copy archived WAL files to a separate directory on the same drive > * settings: restore_command = 'cp /var/lib/pgsql/wal_archive/%f %p' > recovery_target = immediate > > > Once I start the DB, I immediately get > > cp: cannot stat '/var/lib/pgsql/wal_archive/0000001200003CF80000003F': No > such file or directory > > > > *WAL FILES: * > > 0000001200003CF80000003F << First WAL file the DB looks for > 0000001200003CF700000075 << The next WAL file > ---------------------------------------- << Missing WAL files from > pg_receivewal archive > ... > 0000001200003CF8000000C8 << First WAL file in pg_receivewal archive > > *backup_label *file: > > START WAL LOCATION: 3CF7/75974400 (file 0000001200003CF700000075) > CHECKPOINT LOCATION: 3CF8/3F1139C0 > BACKUP METHOD: streamed > BACKUP FROM: standby > START TIME: 2022-06-08 12:51:52 EDT > LABEL: pg_basebackup base backup > START TIMELINE: 18 > > > I can see that START WAL LOCATION & CHECKPOINT LOCATION are where the > basebackup starts to recover from, but how come these are not in my > pg_receivewal archive, if I'm starting it before starting the pg_basebackup > process? How can I ensure they are part of this archive? Well, using -Xs should do that. > Since this is a recent backup, I have a separate WAL archive from where I > can just recover those using a different restore_command, and once that > missing gap of WAL files is recovered from this archive, stopping the > server > and switching back to `restore_command = 'cp /var/lib/pgsql/wal_archive/%f > %p'` and recovering the rest of the WAL files from the pg_receivewal > archive works fine. The DB recovers and accepts connections shortly after. > > The whole point of this is to have a several months old, stand alone > backup, I need to have all the WAL files available in that original WAL > archive created with pg_receivewal. The use-case generally makes sense, in pgbackrest we have --archive-copy for more-or-less the same kind of thing, though you can now just use a separate newly created repo that you back up the standalone backup to (when multiple repos are configured, archive-push will archive to all of them) and then archive that copy. The latter is what we'd generally recommend these days, but with archive-copy and a full backup, you can just grab the full backup directory from inside the repo and it'll have everything. Thanks, Stephen
Attachment:
signature.asc
Description: PGP signature