On Wed, Nov 23, 2016 at 1:16 PM, Israel Brewster <israel@xxxxxxxxxxxxxx> wrote:
I was wondering if someone could inform me about, or point me to an online article about best practices for backing up a postgresql database cluster? At the moment, I have two servers running with streaming replication for failover purposes, and then I do nightly database dumps for recovery/backup purposes. However, I feel this approach is lacking in a couple areas I can think of:
The first thing you have to do is list all of the problems you want to solve by having backups. You did list some of them here, but you should think long and hard about it. I did not see anything about off-site backups, for example. Those are necessary if you want to protect yourself against something like a fire destroying everything at your primary location. Consider your business cost of losing X hours of data vs. the cost of storing that data. The backup and recovery plan you come up with will have to be tailored to your needs.
For my use case I do the following:
One live backup using Slony for immediate server failover.
Nightly pg_dump snapshots kept for 2 weeks for customer "pilot error" recovery.
The first-of-the-month nightly pg_dump file is kept for 2 years, also to help recover from pilot error. So far the furthest back I've needed to restore has been about 6 months, but disk space is cheap for me. These files are kept on a third server that is not one of the two primary DB servers.
The pg_dump files are rsync'd from my data center to a filer at my main office about 40 miles away, for my off-site disaster recovery.
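The nightly rotation above could be scripted roughly like this. Everything here is a sketch: the directory, database name, and off-site host are placeholders, and the actual pg_dump/rsync invocations are shown as comments so the rotation logic stands on its own.

```shell
#!/bin/sh
# Sketch of a nightly backup rotation: 2-week nightlies, 2-year monthlies,
# off-site copy. Paths and names are placeholders, not a real setup.
set -e

BACKUP_DIR=${BACKUP_DIR:-/tmp/pgsql-backups}
DB=${DB:-mydb}
STAMP=$(date +%Y-%m-%d)

mkdir -p "$BACKUP_DIR/nightly" "$BACKUP_DIR/monthly"

# The real job would run something like:
#   pg_dump -Fc "$DB" -f "$BACKUP_DIR/nightly/$DB-$STAMP.dump"
touch "$BACKUP_DIR/nightly/$DB-$STAMP.dump"   # stand-in for the dump file

# Keep the first-of-the-month dump for long-term "pilot error" recovery
if [ "$(date +%d)" = "01" ]; then
    cp "$BACKUP_DIR/nightly/$DB-$STAMP.dump" "$BACKUP_DIR/monthly/"
fi

# Expire nightlies after 14 days and monthlies after roughly 2 years
find "$BACKUP_DIR/nightly" -name "$DB-*.dump" -mtime +14 -delete
find "$BACKUP_DIR/monthly" -name "$DB-*.dump" -mtime +730 -delete

# Ship the whole tree off-site to a third machine, e.g.:
#   rsync -az "$BACKUP_DIR/" backupuser@filer.example.com:/backups/pgsql/
```

Using pg_dump's custom format (-Fc) is worth it here, since pg_restore can then do selective restores of individual tables when recovering from pilot error.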
If I could afford a faster and more reliable network connection at the office, I'd set up live streaming replication off-site as well. However, the physical location makes that unlikely to happen any time soon :(
Also consider the extra burden that doing your backups puts on your servers. Do they have enough I/O capacity to handle it, especially at peak demand times?