Re: Backup

This could be implemented as a fairly simple script that does the following (a rough sketch follows the list):

- Specify a base backup name (e.g. with date/time)
- Connect to postgres database
- Issue pg_start_backup('base backup name')
- Tar the data cluster directory, excluding pg_xlog
- Issue pg_stop_backup
- Compress the created archive
- Move the created archive to the backup history folder (or tape-drive etc)
- Find the date/time stamp of the latest ".backup" file in the xlog archive
- Remove any WAL files created earlier than this ".backup" file (excluding the first WAL file still needed, which is given by START_WAL_LOCATION within the .backup file)
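
As a sketch of those steps in shell - all paths, the label format, and the pruning logic are assumptions for illustration (it also assumes a single WAL timeline), so adjust for your own setup:

    #!/bin/bash
    # Sketch of the nightly base-backup script described above.
    # All paths and the label format are examples - adjust to taste.

    PGDATA=/var/lib/pgsql/data            # the data cluster directory
    ARCHIVE=/path/to/archive              # must match archive_command
    HISTORY=/path/to/backup-history       # where finished backups go
    LABEL="base_$(date +%Y%m%d_%H%M%S)"   # base backup name with date/time

    # Tell PostgreSQL a base backup is starting.
    psql -d postgres -c "SELECT pg_start_backup('$LABEL');"

    # Tar the data cluster directory, excluding pg_xlog.
    tar -cf "/tmp/$LABEL.tar" -C "$(dirname "$PGDATA")" \
        --exclude=pg_xlog "$(basename "$PGDATA")"

    # Tell PostgreSQL the backup has finished; this drops a ".backup"
    # history file into the WAL archive.
    psql -d postgres -c "SELECT pg_stop_backup();"

    # Compress the archive and move it to the backup history folder.
    gzip "/tmp/$LABEL.tar"
    mv "/tmp/$LABEL.tar.gz" "$HISTORY/"

    # Prune the WAL archive: the newest ".backup" file's name begins with
    # the first WAL segment that backup needs (its START_WAL_LOCATION), so
    # anything that sorts before that segment name can safely go.
    NEWEST=$(ls "$ARCHIVE"/*.backup | sort | tail -n 1)
    FIRST_NEEDED=$(basename "$NEWEST" | cut -c1-24)
    for f in "$ARCHIVE"/[0-9A-F]*; do
        name=$(basename "$f")
        [[ "$name" < "$FIRST_NEEDED" ]] && rm "$f"
    done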

This process runs on our main PostgreSQL server at 2:00 in the morning and finishes within 4 minutes (on a 2GB database), and our backups restore reliably - we have a base backup from 2:00am, plus WAL files archived throughout the day. A new WAL file is archived roughly every 3-6 minutes, so should our server crash (or should we feel the need), we can restore to within about 6 minutes of the present.

The archive_command is just a simple copy - "cp %p /path/to/archive/%f"
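
In postgresql.conf that looks something like this (the archive path is an example; note that on PostgreSQL 8.3 and later you also need to set archive_mode = on):

    archive_command = 'cp %p /path/to/archive/%f'   # %p = full path of the WAL file, %f = its file name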

There's really nothing difficult about it, and it puts you (the administrator) in control of every aspect of your backup, which is a good thing.

"If this is it, then I'll end up with an old level zero (i.e. full, base ) backup and A LOT of level 1 (i.e. transaction log) backup. I think it should be more like it's for Informix, where you ask the database to perform an online level 0 backup (base); after this, it stores transaction logs on disk, which you can archive with level 1 backup. Then, say everyonce in a week, you get another level 0 backup, and the database clears the already-archived logs from disk and starts all over."


This to me is exactly the same approach. If you really want to, you can tell PostgreSQL to do an online backup, wait a week (saving all WAL logs), then perform another online backup and remove the previous week's log files - you'll just end up with a lot of log files in between (depending on transaction frequency and archive settings). But you don't want one base backup and then a ton of log files, because it'll take you forever to restore; in fact, the more often you perform a base backup the better. (I had to restore a 2GB database a while back with a week's worth of WAL files, and it took 10 hours!)
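
For the restore side, recovery is driven by a recovery.conf file placed in the restored data directory before starting the server - a minimal sketch, with the archive path again an example:

    # recovery.conf - placed in the restored data directory
    # (path is an example; it must point at the same WAL archive)
    restore_command = 'cp /path/to/archive/%f %p'

Restore the base backup tar first, then start PostgreSQL: it will replay the archived WAL up to the last available segment (or to a recovery_target_time, if you set one).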

"a lack of -i in cp, for example"

FWIW, a missing -i in cp won't render a backup useless - it only matters if the file to be archived has already been archived, in which case there's a good chance it's the same file anyway, as PostgreSQL rotates/recycles WAL file names.
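
If you do want to guard against overwriting an already-archived file, the usual approach (shown in the PostgreSQL documentation) is to have the command fail when the target already exists, so the server retries rather than clobbering it:

    archive_command = 'test ! -f /path/to/archive/%f && cp %p /path/to/archive/%f'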

Hope this goes some way to helping you out,

Andy.

Eduardo J. Ortega wrote:
Hi:

Well, I don't really like the fact that the admin has to specify the archiving and restoring commands; an error here (a lack of -i in cp, for example) may render the backup useless. In addition, the backup is performed only every time a WAL file is filled; I need to take consistent backups every hour or so, and I am not sure if that interval represents more or less than a WAL file (I could still measure that, I guess). Finally, as I understand it, the WAL backup method works like this:
1) Take a full base FS backup
2) Get some way to copy WAL files

If this is it, then I'll end up with an old level zero (i.e. full, base) backup and A LOT of level 1 (i.e. transaction log) backups. I think it should be more like it is in Informix, where you ask the database to perform an online level 0 (base) backup; after this, it stores transaction logs on disk, which you can archive with a level 1 backup. Then, say once a week, you get another level 0 backup, and the database clears the already-archived logs from disk and starts all over.

I guess this could be achieved with PG, but it requires considerably more steps (pg_start_backup, pg_stop_backup, manually cleaning old log files, which could be an error point), or I am getting something wrong. Besides, why do you need to tell the database to stop the backup? Shouldn't it stop by itself when there's no more information to be archived? Perhaps if any of you has this method working, you could provide me with your archiving and restoring commands, as well as a description of the whole process the way you implemented it.

Thanks,

Eduardo.

On Wednesday 13 December 2006 02:39, Shoaib Mir wrote:
To me PITR looks like a very nice solution for incremental backups, and it can even serve as a warm standby. What exactly are the complications you see with WAL archiving?

---------------
Shoaib Mir
EnterpriseDB (www.enterprisedb.com)

On 12/13/06, Eduardo J. Ortega <ejortegau@xxxxxxxxxxxx> wrote:
Hi there:

Are there any nice (official or third-party) backup utilities for
postgres? I have a database which is several GB, so pg_dumping it to a
file and then compressing it with bzip2 every hour is not really the way
to go. I've read a little about using WAL for incremental backup, but it
sounds a little complicated and *very* user-error prone.

(Not sure if this is the right place for it, but I really think that PG's
developers should consider including a viable backup utility in the next
version.)

thanks.

--
Eduardo J. Ortega - Linux user #222873
"No fake - I'm a big fan of konqueror, and I use it for everything." --
Linus
Torvalds




--
Andy Shellam
NetServe Support Team

the Mail Network
"an alternative in a standardised world"

p: +44 (0) 121 288 0832/0839
m: +44 (0) 7818 000834


