On 7/8/19 9:10 AM, Thorsten Schöning wrote:
> Hello David Steele,
> on Monday, July 8, 2019 at 14:12 you wrote:
>
>> pg_start_backup() does a checkpoint, but then the database continues
>> writing as you copy the files in whatever order you choose. You may
>> copy a file that has a partial write or copy some files involved in a
>> transaction before it happens and others afterwards -- in fact this is
>> normal and expected.
>
> And because that's expected, Postgres can successfully restore from
> that, e.g. having used checkpoints before:

No. The data files continue to be modified after the checkpoint while
you are copying. The checkpoint is invalidated at the *very first*
change.

If you start copying files after the pg_start_backup() you will *not*
get a copy of the files as they were right after the checkpoint. The
database writes continuously, so you will get some invalid state in
between the starting checkpoint and the end state (there's no
checkpoint at the end).

>> The checkpoint constrains the range of WAL that you need, but that WAL
>> is absolutely needed to reconstruct the changes that happened during the
>> backup.
>
> Which makes sense if all WAL-archives are simply considered to be
> incremental changes based on some former full backup. But that's the
> point: I don't see how WAL-archives created between pg_start- and
> pg_stop_backup are any different to later ones. Of course one needs
> those to not lose data at all, but that doesn't tell anything about
> how usable the data directory in itself is already without those.

The WAL does not change during a backup. But these WAL are required to
reconstruct the broken state that you get when copying files that are
being actively modified.

> Postgres seems to have simply defined that they additionally care
> about the time when a backup is running.

Yes, we care about it because the backup will be inconsistent without
those WAL.
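
To illustrate why the WAL written during the backup is needed, here is a
toy sketch in Python. It is only a model under loud assumptions: pages
are dictionary entries, the "WAL" is a list of (lsn, page, value)
records, and ToyDatabase and replay are hypothetical names, not anything
in PostgreSQL.

```python
# Toy model only: pages are dict entries and the "WAL" is a list of
# (lsn, page, value) records. This is NOT PostgreSQL's real format.

class ToyDatabase:
    def __init__(self, pages):
        self.pages = dict(pages)
        self.wal = []  # append-only log of every change

    def write(self, page, value):
        # Log the change first, then apply it to the data file.
        self.wal.append((len(self.wal) + 1, page, value))
        self.pages[page] = value

    def current_lsn(self):
        return len(self.wal)


def replay(copy, wal, start_lsn, stop_lsn):
    """Apply WAL records in (start_lsn, stop_lsn] to a copied set of pages."""
    restored = dict(copy)
    for lsn, page, value in wal:
        if start_lsn < lsn <= stop_lsn:
            restored[page] = value
    return restored


db = ToyDatabase({"a": 0, "b": 0, "c": 0})

# Stand-in for pg_start_backup(): remember where the checkpoint is.
start_lsn = db.current_lsn()
at_checkpoint = dict(db.pages)

# Copy files one by one while the database keeps writing.
copy = {}
copy["a"] = db.pages["a"]   # copied before the transaction touches it
db.write("a", 1)            # a transaction updates a and b
db.write("b", 1)
copy["b"] = db.pages["b"]   # copied after the transaction
copy["c"] = db.pages["c"]
db.write("c", 1)            # changed after it was copied

# Stand-in for pg_stop_backup(): remember where the backup ended.
stop_lsn = db.current_lsn()
at_stop = dict(db.pages)

# Only replaying the WAL from start to stop makes the copy consistent.
restored = replay(copy, db.wal, start_lsn, stop_lsn)
```

In this sketch the raw copy matches neither the state at the checkpoint
nor the state at the end of the backup; it only becomes consistent once
the WAL recorded between start and stop has been replayed onto it.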

> Which is fine of course, but
> I still don't see any technical or conceptual limitation of not
> following that decision. If I back up some VM using snapshots, I don't
> necessarily care about the changes made within the VM during the
> backup as well. Those are simply handled by the next backup. But there
> are additional products streaming all changes to the VM somewhere, if
> one needs that.

Snapshots are a different story, but they come with their own baggage.

> OTOH, it's of course good to have two other opinions to mine when my
> boss asks if things are OK the way they are. :-)

Seems to be three now.

--
-David
david@xxxxxxxxxxxxx