Reuven M. Lerner wrote:
> But before I do any of these things, I want to hear what others have
> discovered in terms of high-performance backups. Is there a way to stop
> pg_dump from locking up the database so much? Is there a knob that I
> can turn to do a low-priority backup while the live site is running? Is
> there a superior backup strategy than pg_dump every 24 hours?

If you are sysadmin-minded and your operating system & file system support snapshots, an easy solution (and the one I use) is to create a read-only snapshot of the file system holding the (binary) database files and back that up (a rough sketch of such a job follows at the end of this message).

The approach has some benefits:

* It won't interfere with "normal" database operations (no locking; though I'm not sure that locking is your problem here, as pgsql uses MVCC).
* It works at disk speeds instead of converting data back to SQL for storage.
* Restoring the database works automagically - no need to import the data back from SQL.
* It's convenient to back up snapshots with the usual file system backup utilities. Tar works fine.

It also has some significant disadvantages:

* The binary database representation is usually much larger than the SQL text one (because of indexes and internal structures). OTOH you can easily use tar with gzip to compress it on the fly.
* Technically, the snapshot of the database you're taking represents a corrupted database, which is repaired automatically when it's restored. It's as if you pulled the plug on the server while it was working - PostgreSQL will repair itself on startup.
* You cannot restore the database to a different major version of PostgreSQL. The same rules apply as for an upgrade - for example, data from 8.3.0 will run on 8.3.3, but data from 8.2.0 will not run on 8.3.0.

Warning: DO NOT do on-the-fly binary backups without snapshots. Archiving the database directory with tar on a regular file system while the server is running will result in an archive that most likely won't work when restored.
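To make this concrete, here is a minimal sketch of what such a nightly job might look like, assuming an LVM setup. The volume group, logical volume, snapshot size, mount point and paths below are made up - adjust them to your own layout - and on ZFS or another snapshotting file system the equivalent snapshot/destroy commands would replace the lvcreate/lvremove calls. It needs to run as root (or via sudo):

#!/usr/bin/env python3
# Sketch of a snapshot-based binary backup of $PGDATA, assuming LVM.
# All names and paths below are hypothetical.

import datetime
import subprocess

VG = "vg0"              # volume group holding the PostgreSQL data directory
LV = "pgdata"           # logical volume that $PGDATA lives on
SNAP = "pgdata_snap"    # name for the temporary snapshot
MOUNTPOINT = "/mnt/pgsnap"
BACKUP_DIR = "/backups"

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

stamp = datetime.date.today().isoformat()
archive = f"{BACKUP_DIR}/pgdata-{stamp}.tar.gz"

# 1. Create a read-only snapshot of the live volume (near-instantaneous).
run(["lvcreate", "--snapshot", "--size", "2G",
     "--name", SNAP, f"/dev/{VG}/{LV}"])
try:
    # 2. Mount the snapshot read-only and archive it with tar + gzip.
    run(["mount", "-o", "ro", f"/dev/{VG}/{SNAP}", MOUNTPOINT])
    try:
        run(["tar", "-czf", archive, "-C", MOUNTPOINT, "."])
    finally:
        run(["umount", MOUNTPOINT])
finally:
    # 3. Drop the snapshot so it stops accumulating copy-on-write blocks.
    run(["lvremove", "-f", f"/dev/{VG}/{SNAP}"])

The tar step can take as long as it needs without blocking the server, since it only reads the frozen snapshot; just remember to remove the snapshot when you're done, because a classic LVM snapshot has a fixed copy-on-write area that fills up while the live volume keeps changing.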