Jeff wrote:
If you use PITR + rsync you can create a binary snapshot of the db, so
restore time is simply how long it takes to untar / whatever it into
place. Our backup script basically does:
archive backup directory
pg_start_backup
rsync
pg_stop_backup
voila. I have 2 full copies of the db.
Note that this does NOT protect you against a Pg bug, a filesystem
issue, or a RAID / disk media issue that results in corruption or damage
to the database that isn't immediately detected.
I personally wouldn't want to rely on PITR alone. I take periodic SQL
dumps as well, using PITR more for disaster recovery in case of
software/user error, so I can roll back to when the error occurred
rather than the potentially much earlier time of the last backup.
A dump is much more likely to trip over a problem, so it's good to run
periodically just to make sure the entire DB is readable. It's also more
likely to be restorable if something goes horribly wrong.
Personally I might be OK with WAL-archiving based backups if I was using
a warm-standby server in continuous recovery mode that'd notify me if
anything went wrong with the restore. I'm much more comfortable having
an SQL dump around, though.
You could even expand it a bit
and after the rsync & friends have it fire up the instance and run
pg_dump against it for a pg_restore compatible dump "just in case".
Yep, that's what I do.
It takes a long time to restore a 300GB db, even if you cheat and
parallelify some of it. 8.4 may get a pg_restore that can load in
parallel - which will help somewhat.
It's a great pity that pg_dump can't dump in parallel, though. It makes
sense given that Pg has no mechanism for one backend to "join" anothers'
existing transaction, and without that you'd risk inconsistent
snapshots, but it's still a bit of a pity.
Is it actually that hard to let one backend join another's (strictly
read-only) transaction? The first backend will be ensuring that tuples
in the xmin/xmax range required by the transaction aren't VACUUMed away etc.
--
Craig Ringer
--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance