-----BEGIN PGP SIGNED MESSAGE----- Hash: RIPEMD160 Rajesh Kumar Mallah wrote: |>| Do you see any problem in the current approach ? |>| i have seen it working fine till now. |> |>I do, to be honest. The WAL location counter accounts for 4294967295 |>positions and while I'm certain that's WAY more than the average number |>of transactions that go into a WAL, quite a number of small ones can |>certainly happen before a WAL is rolled over, and until then, you're |>dealing with the same log file. |> |>If two backups happen in that period of time for whatever reason, you're |>going to have a false positive by looking into ${WAL_ARCHIVE} and |>searching just for the WAL name, so including the location in the search |>of a WAL fragment is certainly necessary. Infact, going purely by |>chance, the probability of hitting the same location in two different |>log files in two subsequent backups is much lower than hitting the same |>WAL twice. | | The current wal log is not being removed from the wal archive area | in any case. The files less than the current ones are being rm'ed. | | I am sorry i am not able to get your apprehension. But i shall | surely try harder to understand your point. Hi Rajesh, list. I'm sorry I didn't get back to you earlier, I was at an IBM business conference for a couple of days; not to say it rendered me incapable of communicating via e-mail, but it did bring along certain social responsibilities which caused me to both stay up and sleep late, if you know what I mean. :) Let me explain the above predicament in more practical terms. Let us say you started a backup very soon after a WAL had been rolled over. Current WAL at that time was called, for example, ${PGDATA}/pg_xlog/000000010000000E0000000A. The location at that time was 000F594A (iow, early in the WAL cycle). [disclaimer: all events in this story are entirely fictional, any similarity to actual persons and events is purely coincidental :) ] pg_start_backup() will create a WAL backup: ~ ${PGDATA}/pg_xlog/000000010000000E0000000A.000F594A.backup which will be archived to ${WAL_ARCHIVE} under the same name, or possibly given a different extension, depending on archive_method. Let us assume for the purpose of this explanation, that archive_method consists only of cp -i </dev/null, although the problem would have been identical if one used gzip -c, for example. Now, this backup fails for whatever reason (rsync trouble, etc.). You abort it and leave WAL archive as it was. You diagnose the problem that caused the backup to fail and repeat the procedure. And since your diagnostic skills are so good it took you almost no time to fix it, the database engine is now at location 002D94AF in that _same_ WAL. Once you restart the backup script, pg_start_backup() is called and ${PGDATA}/pg_xlog/000000010000000E0000000A.002D94AF.backup is created and archived to ${WAL_ARCHIVE} under that same name. Your method of discovering logs to delete will now match _two_ "current" log file archives instead of one, because they both come from the same WAL, fail to actually delete the stale one (the one from position 000F594A) and thus clutter your backup with irrelevant WAL fragments. The second part of the second paragraph was only to expose that, following the same logic as outlined above, if you take WAL locations as the criterion of removing stale WAL fragments instead of WAL names, it is far less likely to hit a false positive, because you would have to pg_start_backup() _exactly_ 4294967296 locations after the first one. Of course, you want to be unambiguous in your search of the perfect WAL archive, so you want to use _both_ WAL name and location as the criterion. | the old log files without the base backup are not useful. since | rsync is being used to optimise the copying by overwriting the | base backup everytime, i dont thing preserving the old files | makes sense. Had it been and non overwritng backup the files | would have made sense. I see. I was assuming you used rsync to copy the database cluster somewhere then tar it there, while it was lying still ("Fell, Destroyed" of Fugazi comes to mind :) ). I will get back to you with the review of your script later. A quick scan reveals there is not much left to be improved, though. Kind regards, - -- ~ Grega Bremec ~ gregab at p0f dot net -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.0 (GNU/Linux) iD8DBQFEN68Vfu4IwuB3+XoRAxbjAJ49Hg6d9J0RZLywEZLr3WeD1sHevQCfSOm8 Pr3jEYb/+viTp/OAoJ6a5/M= =9e0i -----END PGP SIGNATURE-----