On Sun, Aug 14, 2011 at 12:44 AM, MirrorX <mirrorx@xxxxxxxxx> wrote:
The issue here is that the server is heavily loaded. The daily traffic is
heavy, which means the db size is increasing every day (by 30 GB on average)
and the size is already pretty large (~2 TB).
At the moment, copying the PGDATA folder (excluding the pg_xlog folder),
compressing it and storing it on a local storage disk takes about
60 hours, and the resulting file is about 550 GB. The archives are kept in a
different location, so that's not a problem. So I don't even want to imagine
how much time the uncompress and copy will take in a 'disaster' scenario.
Plus, we cannot keep an older copy of PGDATA and just replicate the
WALs, because due to the heavy load they are about 150 GB/day. So even if
we suppose that we have unlimited disk storage, it's not reasonable
to use 5 TB for the WALs (if the PGDATA is sent once a month), and
furthermore the 2nd server will need a lot of time to recover, since
it will have to process this huge amount of WALs.
We have a pretty similar situation: database size is ~3 TB with daily xlog generation of about 25 GB. We do a full backup (tar of PGDATA + xlogs) every fortnight and back up just the xlogs in between. The full backup takes almost 48 hours and is about 500 GB in size. All backups are gzipped, of course.
The backup duration is not a problem, but the restore _might_ be. We have restored this database more than once, and each time it was fully restored surprisingly quickly (in a matter of hours). Of course, if you have a 24/7 database this might not be acceptable, but then again, if that's the case you should have a standby ready anyway.
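
To put the two setups side by side, here is a rough sketch of the arithmetic. It only multiplies the daily WAL/xlog volume by the interval between full backups, using the figures quoted in this thread; the function name is made up for illustration, and the result is a worst case (a failure just before the next full backup), not a measurement.

```python
def wal_between_full_backups(wal_gb_per_day, days_between_backups):
    """Worst-case WAL volume in GB to keep (and replay) between two full backups."""
    return wal_gb_per_day * days_between_backups


# Figures from the thread: 150 GB/day with a monthly base backup,
# versus 25 GB/day with a full backup every fortnight.
monthly_heavy = wal_between_full_backups(150, 30)
fortnightly_light = wal_between_full_backups(25, 14)

print(f"150 GB/day, 30-day interval: {monthly_heavy} GB of WAL")
print(f"25 GB/day, 14-day interval: {fortnightly_light} GB of WAL")
```

The first case works out to 4500 GB, which is where the "5 TB" estimate comes from; the second is 350 GB. A shorter interval between full backups shrinks the amount of WAL that has to be kept and replayed in proportion.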
Regards
Mikko