On Fri, 2021-07-23 at 11:56 +0200, Thorsten Schöning wrote: > So a process of "pg_start_backup", directly followed by a > "pg_stop_backup" with assuring that the WALs during both calls are > preserved, should result in a valid backup, shouldn't it? > > An afterwards copied data directory is crash safe again up to the > point "pg_start_backup" has been called and can recover using the WALs > up to the point "pg_stop_backup" has been called. Regardless how long > copying the data directory takes and which changes Postgres applies. > Those changes are simply discarded during recovery, aren't they? > > So one might lose changes during the backup process by not following > the order of the docs, but again, I might not care, like with file > system snapshots. > > Or what's the difference when looking at consistency/recovery compared > to some point in time only between FS level snapshots and using > pg_start|stop_backup+WALs in different order? > > From my understanding, pg_start_backup+pg_stop_backup+FS-snapshot > should even be the most correct approach, as long as the created WALs > are contained in the snapshot. Other thoughts? No, this is all wrong. You need to copy/snapshot the files *between* pg_start_backup and pg_stop_backup. To make that clear, think of two tables with a foreign key relationship: A and B. Now you perform a "backup" like you envision. Now the files for A and B get backed up at different times, let's say that A gets backed up first. Now a client inserts one row in A and one in B that reference each other. The row are inserted after A is backed up, but before B is backed up. Now the backed up files are inconsistent - there is a row in B with no matching row in A. The only way to cope with that is to replay WAL to a point in time *after* the last data file was backed up. But with your proposed backup method, you have no way to determine that point in time, so this is not safe at all. This is exactly the purpose of pg_stop_backup: it does not only archive the WAL segment that completes the backup, but it also adds a BACKUP_END record to the WAL stream. That ensures that recovery cannot stop too early: any attempt to end recovery before reaching BACKUP_END will cause a fatal error. As soon as BACKUP_END is processed, the startup process knows that the database is now consistent (and logs a message to that extent). You can contine recovering to any later point in time, but you need not (recovery_target = immediate). With your proposed method, you run the danger of stopping recovery too early, with the consequence of ending up with a corrupted database. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com