Greetings, * Lucas (root@xxxxxxx) wrote: > On 26/10/2021, at 6:13 AM, Stephen Frost <sfrost@xxxxxxxxxxx> wrote: > > * Mladen Gogala (gogala.mladen@xxxxxxxxx) wrote: > >> On 10/23/21 23:12, Lucas wrote: > >>> This has proven to work very well for me. I had to restore a few backups > >>> already and it always worked. The bad part is that I need to stop the > >>> database before performing the Snapshot, for data integrity, so that means > >>> that I have a hot-standby server only for these snapshots. > >>> Lucas > >> > >> Actually, you don't need to stop the database. You need to execute > >> pg_start_backup() before taking a snapshot and then pg_stop_backup() when > >> the snapshot is done. You will need to recover the database when you finish > >> the restore but you will not lose any data. I know that pg_begin_backup() > >> and pg_stop_backup() are deprecated but since PostgreSQL doesn't have any > >> API for storage or file system snapshots, that's the only thing that can > >> help you use storage snapshots as backups. To my knowledge,the only database > >> that does have API for storage snapshots is DB2. The API is called "Advanced > >> Copy Services" or ACS. It's documented here: > >> > >> https://www.ibm.com/docs/en/db2/11.1?topic=recovery-db2-advanced-copy-services-acs > >> > >> For Postgres, the old begin/stop backup functions should be sufficient. > > > > No, it's not- you must also be sure to archive any WAL that's generated > > between the pg_start_backup and pg_stop_backup and then to be sure and > > add into the snapshot the appropriate signal files or recovery.conf, > > depending on PG version, to indicate that you're restoring from a backup > > and make sure that the WAL is made available via restore_command. > > > > Just doing stat/stop backup is *not* enough and you run the risk of > > having an invalid backup or corruption when you restore. > > > > If the entire system is on a single volume then you could possibly just > > take a snapshot of it (without any start/stop backup stuff) but it's > > very risky to do that and then try to do PITR with it because we don't > > know where consistency is reached in such a case (we *must* play all the > > way through to the end of the WAL which existed at the time of the > > snapshot in order to reach consistency). > > > > In the end though, really, it's much, much, much better to use a proper > > backup and archiving tool that's written specifically for PG than to try > > and roll your own, using snapshots or not. > > When I create a snapshot, the script gets the latest WAL file applied from [1] and adds that information to the Snapshot Tags in AWS. I then use that information in the future when restoring the snapshot. The script will read the tag and it will download 50 WAL Files before that and all the WAL files after that required. > The WAL files are being backed up to S3. > > I had to restore the database to a PITR state many times, and it always worked very well. > > I also create slaves using the snapshot method. So, I don’t mind having to stop/start the Database for the snapshot process, as it’s proven to work fine for the last 5 years. I have to say that the process used here isn't terribly clear to me (you cleanly shut down the database ... and also copy the WAL files?), so I don't really want to comment on if it's actually correct or not because I can't say one way or the other if it is or isn't. I do want to again stress that I don't recommend writing your own tools for doing backup/restore/PITR and I would caution people against people trying to use this approach you've suggested. Also, being able to tell when such a process *doesn't* work is non-trivial (look at how long it took us to discover the issues around fsync..), so saying that it seems to have worked for a long time for you isn't really enough to make me feel comfortable with it. Thanks, Stephen
Attachment:
signature.asc
Description: PGP signature