Re: pg_upgrade + replica servers + rsync --size-only is unsafe

Stephen Frost <sfrost@xxxxxxxxxxx> · Wed, 2 Dec 2020 12:28:05 -0500

Greetings,

* Rob Emery (re-pgsql@xxxxxxxxxxxxxxx) wrote:
> We're pretty sure that we've just hit a scenario where the instructions for
> pg_upgrade with standby servers aren't quite safe.

I've pretty much always felt those instructions should come with a
"expert use only" note.

> We were testing our upgrade process with a copy of our live cluster in a
> lab environment and we found that using
> 
> rsync --verbose --archive --delete --hard-links --size-only
> --no-inc-recursive /media/postgresql/data/main /media/postgresql/data/9.6
> PGRETESTA02:/media/postgresql/data
> 
> Resulted in:
> 
> 2020-12-02 14:49:11.513 GMT [20884-1] LOG:  database system was shut
> down in recovery at 2020-12-02 13:57:56 GMT
> 2020-12-02 14:49:11.513 GMT [20884-2] LOG:  entering standby mode
> 2020-12-02 14:49:11.557 GMT [20884-3] LOG:  consistent recovery state
> reached at 19E/25000098
> 2020-12-02 14:49:11.557 GMT [20884-4] LOG:  invalid record length at
> 19E/25000098: wanted 24, got 0
> 2020-12-02 14:49:11.559 GMT [20883-1] LOG:  database system is ready
> to accept read only connections
> 2020-12-02 14:49:11.593 GMT [20888-1] FATAL:  database system
> identifier differs between the primary and standby
> 2020-12-02 14:49:11.593 GMT [20888-2] DETAIL:  The primary's
> identifier is 6901669428825624285, the standby's identifier is
> 690161835164
> 1138930.
> 
> however if we don't use --size-only, then the process worked fine and we a
> smooth upgrade.

This doesn't look like a --size-only issue though- how did you end up
with this when the pg_controldata file (where the system identifier is
pulled from) shouldn't even exist and won't be a hardlink to the
existing one on the old system since it's a new cluster, and therefore
should definitely be copied?

When you go to run the rsync to get the replica up to date you should
have a system that looks like this:

PRIMARY:

/srv/old_cluster
   ... lots of files
/srv/new_cluster
   ... lots of *new* files, including pg_controldata
   ... hardlinks to PG table/index files that aren't part of the catalog

REPLICA:

/srv/old_cluster
   ... lots of files
/srv/new_cluster
   ... entirely empty

Then you run the rsync, at the /srv level, and all those 'new files' in
/srv/new_cluster on the primary should get copied over to
/srv/new_cluster on the replica, while any files which are hard-linked
between old_cluster and new_cluster should end up as hard links on the
replica.

Thanks,

Stephen
Attachment:
signature.asc

Description: PGP signature