Re: Shortest offline window on database migration

Tomas Vondra <tomas.vondra@xxxxxxxxxxxxxxx> · Thu, 30 May 2019 17:31:28 +0200

On Thu, May 30, 2019 at 12:08:04PM -0300, Haroldo Kerry wrote:
Hello,

We are migrating our PostgreSQL 9.6.10 database (with streaming replication
active) to a faster disk array.
We are using this opportunity to enable checksums, so we will have to do a
full backup-restore.
The database size is about 500GB, it takes about 2h:30min for a full
backup, and then about 1h to fully restore it with checksum enabled on the
new array, plus 2h to recreate the replica on the old array.

Although all synthetic tests (pgbench) indicate the new disk array is
faster, we will only be 100% confident once we see its performance on
production, so our backup plan is using our replica database on the older
array. If the new array performance is poor during production ramp up, we
can switch to the replica with little impact to our customers.

Problem is the offline window for backup, restore the full database with
checksum and recreate the replica is about 5h:30m.

One thing that occurred to us to shorten the offline window was restoring
the database to both the master and replica in parallel (of course we would
configure the replica as master do restore the database), that would shave
1h of the total time. Although this is not documented we thought that
restoring the same database to identical servers would result in binary
identical data files.

We tried this in lab. As this is not a kosher way to create a replica, we
ran a checksum comparison of all data files, and we ended up having a lot
of differences. Bummer. Both master and replica worked (no errors on logs),
but we ended up insecure about this path because of the binary differences
on data files.
But in principle it should work, right?

What should work? Backup using pg_dump and restore certainly won't give
you the same binary files - the commit timestamps will be different,
operations may happen in a different order (esp. with parallel restore),
and so on. And the instances don't start as a copy anyway, so there will
be different system IDs, etc.

So no, this is not a valid way to provision master/standby cluster.

Has anyone been through this type of problem?

Unfortunately, I don't think there's a much better solution that what you
initially described - dump/restore, and then creating a replica.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services