I'm running a medium-traffic Web site that has been running for a few
years, and which uses about four PostgreSQL databases on a regular
basis. I'm currently running 8.2, although I'm planning to upgrade to
8.3 in the coming week or two, in part because of the problems that I'm
having. The databases consume a combined total of 35 GB. Like a good
boy, I've been backing the system up overnight, when we have less
traffic, since the site began to run. I use pg_dump to back up, saving
both schemas and data for a full restore in case of failure. pg_dump
typically executes from another machine on a local network; if it would
help to run pg_dump locally, then I'm certainly open to doing that.
Over the last month or two, database performance has become increasingly
problematic during the hours that I run pg_dump. Moreover, the size of
the database has gotten to the point where it takes a good number of
hours to dump everything to disk. This ends up interfering with our
users on the East Coast of the United States, when they access our site
early in the morning.
One possible solution is for me to backup our main database more
regularly, and our development database less regularly. But given the
growth in our traffic (about double what it was 12 months ago), I have
to assume that this isn't a long-term solution.
I'm also considering taking our oldest data and sticking into a separate
database (sort of a data warehouse), so that the production database
becomes smaller, and thus easier to back up.
But before I do any of these things, I want to hear what others have
discovered in terms of high-performance backups. Is there a way to stop
pg_dump from locking up the database so much? Is there a knob that I
can turn to do a low-priority backup while the live site is running? Is
there a superior backup strategy than pg_dump every 24 hours?
Thanks in advance for any advice you can offer!
Reuven
--
Reuven M. Lerner -- Web development, consulting, and training
Mobile: +972-54-496-8405 * US phone: 847-230-9795
Skype/AIM: reuvenlerner
--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance