Greetings, * Ron (ronljohnsonjr@xxxxxxxxx) wrote: > We've got an old (v8.4.17, thus no parallel backups) 2.9TB database that > needs to be migrated to a new data center and then restored to v9.6.9. You should be using 9.6's pg_dump to perform the export. Might be a bit annoying to do, but you should be able to install it on to a nearby server or the same server as 8.4 is running on but in another location. With 9.6's pg_dump, you could use parallel mode, but you have to prevent anything from changing the data between when the first connection from pg_dump is made until all of the connections have completed and started their transactions (should be just a few seconds, really). Of course, that export won't include any changes after the pg_dump starts, so you'll need a way to manage those. > The database has many large tables full of bytea columns containing pdf > images, and so the dump file is going to be more than 2x larger than the > existing data/base... The dump file isn't going to include any content from indexes and, at least looking at some PDFs locally, they can certainly be compressed effectively sometimes, and they might be getting compressed today in your 8.4 instance thanks to TOAST, and more to the point, the textual representation of a bytea which will end up in the export would almost certainly be compressable too. > The command is: > $ pg_dump -v -Z0 -Fc $DB --file=${TARGET}/${DATE}_${DB}.dump 2> > ${DATE}_${DB}.log > > Using -Z0 because pdf files are already compressed. They aren't really PDF files in the export though- they're bytea's represented in a textual format. Have you tested the difference between using -Z0 and allowing compression to happen? > Because of an intricate web of FK constraints and partitioned tables, the > customer doesn't trust a set of "partitioned" backups using --table= and > regular expressions (the names of those big tables all have the year in > them), and so am stuck with a single-threaded backup. All of the FKs will be re-checked when the data is imported into the new instance. > Are there any config file elements that I can tweak (extra points for not > having to restart postgres) to make it run faster, or deeper knowledge of > how pg_restore works so that I could convince them to let me do the > partitioned backups? pg_restore isn't doing much more than restoring what's in the backup into the database using COPY commands. Since it's an export/import, all the FKs and other constraints will be re-checked and all indexes will be rebuilt during the import. > Lastly, is there any way to not make the backups so large (maybe by using > the --binary-upgrade option, even though the man page says, "in-place > upgrades only")? You could possibly upgrade the existing system from 8.4 to 9.6 in-place (which would require a bit of downtime but typically on the order of minutes instead of many hours) and then take a filesystem-level backup using a tool like pgBackRest and then restore that at the new data as a replica and use streaming replication until you're ready to cut over to the new data center. That's probably how I'd tackle this anyway, though one nice thing about the dump/restore is that you could have checksums enabled on the new cluster. Thanks! Stephen
Attachment:
signature.asc
Description: PGP signature