Guten Tag Laurenz Albe, am Freitag, 23. Juli 2021 um 12:41 schrieben Sie: > The only way to cope with that is to replay > WAL to a point in time *after* the last data file was backed up. > But with your proposed backup method, you have no way to determine > that point in time, so this is not safe at all. I think your example is bad: If both rows are added in independent transactions, possibly inconsistent table data simply needs to be accepted. No backup mechanism will change anything on that, because the second row might always be inserted "too late". If both rows are inserted in the same transaction, things depend on when that got committed: If it's before pg_stop_backup, it's at least contained in the last WAL and Postgres can recover to that point. If it's committed after pg_stop_backup, the change is lost regardless how files have been copied before. > This is exactly the purpose of pg_stop_backup: it does not only > archive the WAL segment that completes the backup, but it also adds > a BACKUP_END record to the WAL stream. With having pg_start_backup+pg_stop_backup that record is available. > That ensures that recovery > cannot stop too early: any attempt to end recovery before reaching > BACKUP_END will cause a fatal error. And here's the point: File system snapshots don't have that record at all, but are considered safe because of the crash safety guarantees built into Postgres. So as long as the data directory makes somewhat sense, Postgres obviously is able to start and does some recovering. That recovering is the only difference: In case of file system snapshots it recovers beginning from the last CHECKPOINT, as that is considered safe, and applies additional WALs as available. With pg_start_backup+copy+pg_stop_backup, the state of the finally copied data directory is unknown as well. It might be the same checkpoint when pg_start_backup finished or might be a newer one, because copying data took so long or whatever. For recovery, Postgres can only rely on the currently available checkpoint of the data directory OR what is stored in the backup_label file: > START WAL LOCATION: 428/40000060 (file 000000010000042800000040) > CHECKPOINT LOCATION: 428/40000098 > BACKUP METHOD: pg_start_backup > BACKUP FROM: master > START TIME: 2021-07-23 08:05:25 CEST > LABEL: postgres_files_full_on_ext4 > START TIMELINE: 1 So what does it do? From my understanding it starts from the checkpoint location of the file and recovers using the available WALs. Because that is the only way to not rely on the state of the copied data directory too much. Or does it take a more current checkpoint available in the data directory into account? From my understanding it can't: The files declaring that more current checkpoint might have been copied after all other files have been copied already, but without the data necessary for the checkpoint in the end. To prevent that, Postgres starts in the past by using CHECKPOINT LOCATION. And here's the point again: When it always starts from the checkpoint stored in backup_label, from a crash safety perspective it doesn't make any difference if files are copied before or after pg_stop_backup. The only difference would be that all data during the copying process itself would need to be considered lost. But that might easily be acceptable, backups become outdated all the time. > As soon as BACKUP_END is > processed, the startup process knows that the database is now > consistent (and logs a message to that extent). BACKUP_END will still be processed when copying after pg_stop_backup. > You can contine > recovering to any later point in time, but you need not > (recovery_target = immediate). Because things depend on the available WALs and not the files in the data directory. This would still be the case when copying after pg_stop_backup. > With your proposed method, you run the danger of stopping recovery too > early, with the consequence of ending up with a corrupted database. But BACKUP_END would be available in both cases. Mit freundlichen Grüßen Thorsten Schöning -- AM-SoFT IT-Service - Bitstore Hameln GmbH Mitglied der Bitstore Gruppe - Ihr Full-Service-Dienstleister für IT und TK E-Mail: Thorsten.Schoening@xxxxxxxxxx Web: http://www.AM-SoFT.de/ Tel: 05151- 9468- 0 Tel: 05151- 9468-55 Fax: 05151- 9468-88 Mobil: 0178-8 9468-04 AM-SoFT IT-Service - Bitstore Hameln GmbH, Brandenburger Str. 7c, 31789 Hameln AG Hannover HRB 221853 - Geschäftsführer: Janine Galonska Für Rückfragen stehe ich Ihnen sehr gerne zur Verfügung. Mit freundlichen Grüßen Thorsten Schöning Tel: 05151 9468 0 Fax: 05151 9468 88 Mobil: Webseite: https://www.am-soft.de AM-Soft IT-Service - Bitstore Hameln GmbH i.G. ist ein Mitglied der Bitstore Gruppe - Ihr Full-Service-Dienstleister für IT und TK AM-Soft IT-Service - Bitstore Hameln GmbH i.G. Brandenburger Str. 7c 31789 Hameln Tel: 05151 9468 0 Bitstore IT-Consulting GmbH Zentrale - Berlin Lichtenberg Frankfurter Allee 285 10317 Berlin Tel: 030 453 087 80 CBS IT-Service - Bitstore Kaulsdorf UG Tel: 030 453 087 880 1 Büro Dallgow-Döberitz Tel: 03322 507 020 Büro Kloster Lehnin Tel: 033207 566 530 PCE IT-Service - Bitstore Darmstadt UG Darmstadt Tel: 06151 392 973 0 Büro Neuruppin Tel: 033932 606 090 ACI EDV Systemhaus - Bitstore Dresden GmbH Dresden Tel: 0351 254 410 Das Systemhaus - Bitstore Magdeburg GmbH Magdeburg Tel: 0391 636 651 0 Allerdata.IT - Bitstore Wittenberg GmbH Wittenberg Tel: 03491 876 735 7 Büro Liebenwalde Tel: 033054 810 00 HSA - das Büro - Bitstore Altenburg UG Altenburg Tel: 0344 784 390 97 Bitstore IT – Consulting GmbH NL Piesteritz Piesteritz Tel: 03491 644 868 6 Solltec IT-Services - Bitstore Braunschweig UG Braunschweig Tel: 0531 206 068 0 MF Computer Service - Bitstore Gütersloh GmbH Gütersloh Tel: 05245 920 809 3 Firmensitz: AM-Soft IT-Service - Bitstore Hameln GmbH i.G. , Brandenburger Str. 7c , 31789 Hameln Geschäftsführer Janine Galonska