Greetings, * julyanto SUTANDANG (julyanto@xxxxxxxxxxxx) wrote: > Thanks for elaborating this Information, this is new, so whatever it is the > procedure is *Correct and Workable*. Backups are extremely important, so I get quite concerned when people provide incorrect information regarding them. > > With all of the WAL > > which was created during the backup, PG will be able to recover from the > > changes made during the backup to the data directory, but you *must* > > have all of that WAL, or the backup will be inconsistent because of > > That is rather out of question, because all what we discuss here is just > doing full/snapshot backup. It's unclear what you mean by 'out of question' or why you believe that it matters if it's a full backup or not. Any backup of PG *must* include all of the WAL that was created during the backup. > The backup is Full Backup or Snapshot and it will work whenever needed. > We are not saying about Incremental Backup yet. > Along with collecting the XLOG File, you can have incremental backup and > having complete continuous data backup. > in this case, Stephen is suggesting on using pg_receivexlog or > archive_command > (everything here is actually explained well on the docs)) No, that is not correct. You must have the WAL for a full backup as well. If I understand what you're suggesting, it's that WAL is only for point-in-time-recovery, but that is *not* correct, WAL is required for restoring a full backup to a consistent state. > those changes that were made to the data directory after > > pg_start_backup() was called. > > > > In other words, if you aren't using pg_receivexlog or archive_command, > > your backups are invalid. > > > I doubt that *invalid* here is a valid word > In term of snapshot backup and as long as the snapshot can be run, that is > valid, isn't it? It's absolutely correct, you must have the WAL generated during your backup or the backup is invalid. If, what you mean by 'snapshot' is a *full-system atomic snapshot*, provided by some layer lower than PostgreSQL that is *exactly* as if the machine was physically turned off all at once, then, and *only* then, can you be guaranteed that PG will be able to recover, but the reason for that is because PG will go back to the last checkpoint that happened, as recorded in pg_control, and replay all of the WAL in the pg_xlog/pg_wal directory, which must all exist and be complete for all committed transaction because the WAL was sync'd to disk before the commit was acknowledged and the WAL is not removed until after a checkpoint has completed which has sync'd the data in the data directory out to the filesystem. That's also known as 'crash recovery' and it works precisely because all of the WAL is available at the time of the event and we have a known point to go back to (the checkpoint). During a backup, multiple checkpoints can occur and WAL will be removed from the pg_xlog/pg_wal directory during the backup; WAL which is critical to the consistency of the database and which must be retained by the user because it must be used to perform WAL replay of the database when restoring from the backup which was made. > > if you wanted to backup in later day, you can use rsync then it will copy > > > faster because rsync only copy the difference, rather than copy all the > > > data. > > > > This is *also* incorrect. rsync, by itself, is *not* safe to use for > > doing that kind of incremental backup, unless you enable checksums. The > > reason for this is that rsync has only a 1-second level granularity and > > it is possible (unlikely, though it has been demonstrated) to miss > > changes made to a file within that 1-second window. > > As long as that is not XLOG file, anyway.. as you are saying that wouldn't > be a problem since actually we can run the XLOG for recovery. . No, that's also not correct, unless you keep all WAL since the *first* full backup. The 1-second window concern is regarding the validity of a subsequent incremental backup. This is what happens, more-or-less: 1- File datadir/A is copied by rsync 2- backup starts, user retains all WAL during backup #1 3- File datadir/A is copied by rsync in the same second as backup started 4- File datadir/A is *subsequently* modified by PG and the data is written out to the filesystem, still within the same second as when the backup started 5- The rsync finishes, the backup finishes, all WAL for backup #1 is retained, which includes the changes made to datadir/A during the backup. Everything is fine at this point for backup #1. 6- A new, incremental, backup is started, called backup #2. 7- rsync does *not* copy the file datadir/A because it was not subsequently changed by the user and the timestamp is the same, according to rsync's 1-second-level granularity. 8- The WAL for backup #2 is retained, but it does not contain any of the changes which were made to datadir/A because *those* changes are in the WAL which was written out during backup #1 9- backup #2 completes, with its WAL retainined 10- At this point, backup #2 is an invalid backup. This is not hypothetical, it's been shown to be possible to have this happen. (side-note: this is all from memory, so perhaps there's a detail or two incorrect, but this is the gist of the issue) > > > my latter explanation is: use pg_basebackup, it will do it automatically > > > for you. > > > > Yes, if you are unsure about how to perform a safe backup properly, > > using pg_basebackup or one of the existing backup tools is, by far, the > > best approach. Attempting to roll your own backup system based on rsync > > is not something I am comfortable recommending any more because it is > > *not* simple to do correctly. > > OK, that is fine, and actually we are using that. You must be sure to use one of the methods with pg_basebackup that keeps all of the WAL created during the full backup. That would be one of: pg_basebackup -x, pg_basebackup -X stream, or pg_basebackup + pg_receivexlog. > the reason why i explain about start_backup and stop_backup is to give a > gradual understand, and hoping that people will get the mechanism in the > back understandable. I'm more than happy to have people explaining about pg_start/stop_backup, but I do have an issue when the explanation is incorrect and could cause a user to use a backup method which will result in an invalid backup. Thanks! Stephen
Attachment:
signature.asc
Description: Digital signature