Re: PITR Base Backup on an idle 8.1 server

Marco Colombo <pgsql@xxxxxxxxxx> · Fri, 01 Jun 2007 16:50:32 +0200

Greg Smith wrote:
On Thu, 31 May 2007, Marco Colombo wrote:

archive_command = 'test ! -f /var/lib/pgsql/backup_lock </dev/null'
Under normal conditions (no backup running) this will trick PG into 
thinking that segments get archived. If I'm not mistaken, PG should 
behave exactly as if no archive_command is configured, and recycle 
them ASAP.

That's correct.  I don't think you even need the </dev/null in that 
command.

Ok, thanks. I've seen that </dev/null somewhere in the docs, and blindly 
copied it.
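
For illustration, here is a minimal sketch of the exit statuses that 
archive_command hands back to PG in the two situations. The lock path is 
a temporary stand-in for /var/lib/pgsql/backup_lock (an assumption, so 
the snippet can run away from a real server), and the </dev/null is left 
out because test does not read standard input.

```shell
# Hypothetical stand-in for /var/lib/pgsql/backup_lock.
lock="$(mktemp -d)/backup_lock"

# No backup running: the lock is absent, so the test succeeds (status 0)
# and PG would consider the segment archived and free to recycle.
no_backup=0
test ! -f "$lock" || no_backup=$?

# Backup running: the lock exists, so the test fails (non-zero status)
# and PG would keep the segment in pg_xlog and retry later.
touch "$lock"
during_backup=0
test ! -f "$lock" || during_backup=$?

echo "no backup: $no_backup, during backup: $during_backup"
```
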

Should a WAL segment fill up during the backup (unlikely as it is, 
since the system is mostly idle AND the tar completes within a minute 
- but it's still possible), the test command would report failure in 
archiving the segment, and PG would keep it around in pg_xlog, ready 
to be tar'ed at step 5 (mind you - this is speculation since I had no 
time to actually test it).

That's also correct.  What you're doing will work for getting a useful 
backup.

Great, that's all I need.

However, recognize the limitations of the approach:  this is a clever 
way to make a file-system level snapshot of your database without 
involving the archive logging process.  You'll get a good backup at that 
point, but it won't provide you with any ability to do roll-forward 
recovery if the database gets screwed up in the middle of the day.  
Since that's a requirement of most PITR setups, I'm not sure your 
workaround accomplishes what you really want.  More on why that is below.

Here's the original thread I started.

http://archives.postgresql.org/pgsql-general/2007-05/msg00673.php

Briefly, I don't need PITR proper; it may even be harmful in my case. 
The data in the db may be tied to the data on the filesystem in ways 
unknown to me... think of some kind of custom CMS. I'm able to restore 
.html, .php, .png or whatever files as they were at backup time (say, 
2:00AM). All I need from the PG backup is to restore the db contents as 
of (almost) the same time. The only point in time I'm interested in is 
backup time, so to speak.

Restore would be done the usual way, extracting both the archives, 
maybe adding WAL segments from the crashed pg_xlog. I still have to 
find out whether I need to configure a fake restore_command.

This won't work, and resolving it will require coming to grips with the 
full archive logging mechanism rather than working around it the way you 
suggest above.

This is interesting. Why exactly won't it work? Let's say I trick PG into 
thinking it's a recovery from backup + archived WAL. It'll find all the 
segments it needs (and no more) already in pg_xlog. I expect it to just 
use them. Maybe I'd need to configure /bin/false as restore_command. Or 
maybe just something like 'test -f /var/lib/pgsql/data/pg_xlog/%f' (true 
if the file is already there). I'll have to experiment, but I don't see 
any major problem right now. The files are already there.
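
A minimal sketch of what that 'test -f' restore_command would report, 
after PG substitutes %f with a segment file name. The directory is a 
temporary stand-in for /var/lib/pgsql/data/pg_xlog and the segment names 
are made up for the example. It only shows the exit statuses; whether 
recovery then proceeds from the files already in place is exactly the 
part that still needs experimenting.

```shell
# Hypothetical stand-in for /var/lib/pgsql/data/pg_xlog.
xlog_dir="$(mktemp -d)"
touch "$xlog_dir/000000010000000000000003"   # a segment extracted from the tar

# What the restore_command would run once %f has been substituted.
have_segment() {
    test -f "$xlog_dir/$1"
}

# Segment already in place: status 0 ("restored").
present=0
have_segment 000000010000000000000003 || present=$?

# Segment not there: non-zero status, which tells PG the archive
# does not have it.
missing=0
have_segment 000000010000000000000004 || missing=$?

echo "present: $present, missing: $missing"
```
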

Every time the server hits a checkpoint, it recycles old WAL 
segments--renames them and then overwrites them with new data.  The 
first time your database hits a checkpoint after your backup is done, 
you will have lost segment files such that it's impossible to recover 
the current state of the database anymore.  You'll have the first part 
of the series (from the base backup), the last ones (from the current 
pg_xlog), but will be missing some number in the middle (the recycled 
files).

Sure, now I see what you mean, but I was working under the assumption of 
very low database activity; in my case, it's about 2 WAL segments/day. I 
usually see files in my pg_xlog that are 2 days old, so there won't be 
any missing segments. And anyway, the ability to recover to some time 
after the backup is just a plus. I don't need it. In case of a complete 
crash, I'm going to restore the whole system as it was at backup time. 
And if only the PG datadir gets corrupted later, and I want to try to 
recover it as it was at that later time, I still have a 99% chance of 
being able to do so, due to very low write activity. And if that fails, 
because of some uncommon write activity right at that inconvenient time, 
I can just fall back to the case of a complete system crash. The chances 
of that happening are possibly lower than those of a system crash, so I'm 
not worried about it.

I think that all we want is a backup that is immediately usable, w/o 
waiting for the WAL segment it relies on to be archived. That is, if 
taken at 2:00AM, it may be used to recover a crash at 2:10AM (assuming 
the backup process ended by that time, of course).

If you need *both* a "full backup" *and* PITR, just add a real cp to the 
archive_command above. The important part is to return failure during 
the backup process, I think.
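
As a rough sketch of that combination: the lock test comes first, and the 
cp only runs when no backup is in progress, so during the backup the 
command still returns failure. In postgresql.conf this would look 
something like 'test ! -f /var/lib/pgsql/backup_lock && cp %p 
/some/archive/dir/%f', where the archive directory is a placeholder of 
mine. The snippet below simulates it with temporary paths and a made-up 
segment name, and is not tested against a real server.

```shell
work="$(mktemp -d)"
lock="$work/backup_lock"        # stand-in for /var/lib/pgsql/backup_lock
archive="$work/archive"         # hypothetical archive directory
segment="000000010000000000000001"
mkdir -p "$archive"
echo "wal data" > "$work/$segment"

# $1 plays the role of %p (path to the segment), $2 the role of %f.
archive_segment() {
    test ! -f "$lock" && cp "$1" "$archive/$2"
}

# During the backup: the lock exists, the command fails, nothing is
# copied, and PG would keep the segment in pg_xlog.
touch "$lock"
during=0
archive_segment "$work/$segment" "$segment" || during=$?

# After the backup: the lock is gone, PG retries, and the copy succeeds.
rm -f "$lock"
after=0
archive_segment "$work/$segment" "$segment" || after=$?

echo "during backup: $during, after backup: $after"
```
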

.TM.