On 7/3/2014 6:26 PM, Bosco Rama wrote:
Hmmm. You are using '--oids' to *include* large objects? IIRC, that's
not the intent of that option. Large objects are dumped as part of a
DB-wide dump unless you request that they not be. However, if you
restrict your dumps to specific schemata and/or tables then the large
objects are NOT dumped unless you request that they are. Something to
keep in mind.
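For anyone following along, here's a rough illustration of that behaviour (the database and table names are made up):

# DB-wide dump: large objects come along by default
$ pg_dump -Fc mydb > mydb.dump

# Dump restricted to one table: large objects are skipped unless you
# explicitly ask for them with -b/--blobs
$ pg_dump -Fc -t mytable -b mydb > mytable.dump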
I can get rid of --oids and see what happens. I used to have
cross-table references to OID fields, so this is no doubt a holdover,
but I think I now use UUIDs for all such links/references and the OID
fields are just like any other data field. The option may no longer be
needed; I'll see whether dropping it speeds up the backup and whether
everything still restores correctly.
Many of the large objects are gzip-compressed when stored. Would I be
better off letting PG do its compression and dropping gzip, or turning
off all PG compression and using gzip? Or perhaps using neither, since
my large objects, which take up the bulk of the database, are already
compressed?
OK. Given all the above (and that gpg will ALSO compress unless told
not to), I'd go with the following (note the lowercase 'z' in the gpg
command). Note also that there may be a CPU vs. I/O trade-off here that
muddies things, but the following are 'conceptually' true.
Fast but big
============
$ pg_dump -Z0 -Fc ... $DB | gpg -z0 ... | split
Less fast but smaller
=====================
$ pg_dump -Z1 -Fc ... $DB | gpg -z0 ... | split
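For completeness, the reverse pipeline would look roughly like this (a sketch only; it assumes the default split output names xaa, xab, ..., and a non-parallel restore, since pg_restore can't read from a pipe when -j is used):

$ cat x?? | gpg -d ... | pg_restore -d $DB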
I'll give that a try now. I didn't notice any real time savings when I
went from no -Z param at all to -Z 0, and, oddly, not much of a
difference when I removed gzip entirely.
BTW, is there any particular reason to do the 'split'?
Yes. I transfer the files to Amazon S3, and there were too many
problems with one really big file.
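In case it helps anyone doing the same, here's a hedged example of sizing and naming the pieces for that upload (the 1G chunk size and the 'mydb.dump.part.' prefix are just illustrative, and -b 1G assumes GNU split):

$ pg_dump -Z0 -Fc ... $DB | gpg -z0 ... | split -b 1G - mydb.dump.part.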
Thanks again...