Re: 'replication checkpoint has wrong magic' on the newly cloned replicas

Stephen Frost <sfrost@xxxxxxxxxxx> · Wed, 29 Nov 2017 09:54:55 -0500

Greetings,

* Alex Kliukin (alexk@xxxxxxxxxxxx) wrote:
> The cloning itself is done by copying a compressed image via ssh,
> running the following command from the replica:
> 
>  """ssh {master} 'cd {master_datadir} && tar -lcp --exclude "*.conf" \
>          --exclude "recovery.done" \
>          --exclude "pacemaker_instanz" \
>          --exclude "dont_start" \
>          --exclude "pg_log" \
>          --exclude "pg_xlog" \
>          --exclude "postmaster.pid" \
>          --exclude "recovery.done" \
>            * | pigz -1 -p 4' | pigz -d -p 4 | tar -xpmUv -C
>            {slave_datadir}""
> 
> The WAL archiving starts before the copy starts, as the script that
> clones the replica checks that WAL archiving is running before the
> cloning.

Maybe you're doing it and haven't mentioned it, but you have to use
pg_start_backup/pg_stop_backup, because otherwise PG is going to think
it's doing crash recovery from the last checkpoint written, rather than
going back to when the backup started and replaying all of the WAL from
that point.
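
A minimal sketch of that sequence, assuming a 9.x-era server where the
exclusive pg_start_backup()/pg_stop_backup() functions exist (they were
renamed, and exclusive mode removed, in PostgreSQL 15).  copy_datadir is
a hypothetical placeholder for the ssh/tar/pigz pipeline quoted above,
and the label 'replica_clone' is arbitrary:

```shell
#!/bin/sh
# Sketch only: wrap a file-level copy in an exclusive-mode base backup.
# Assumes PostgreSQL 9.x/10 function names; copy_datadir is a hypothetical
# helper that must be defined elsewhere (e.g. the ssh/tar/pigz pipeline).

clone_in_backup_mode() {
    master=$1

    # Forces a checkpoint and writes backup_label into the data directory.
    # That file is what makes recovery start from the backup's starting
    # point instead of from the latest checkpoint in pg_control.
    psql -h "$master" -At -v ON_ERROR_STOP=1 \
        -c "SELECT pg_start_backup('replica_clone', true)" || return 1

    # Run the copy, remembering its exit status without aborting early.
    copy_status=0
    copy_datadir "$@" || copy_status=$?

    # Always leave backup mode, even if the copy failed.
    psql -h "$master" -At -v ON_ERROR_STOP=1 \
        -c "SELECT pg_stop_backup()" || return 1

    return "$copy_status"
}
```

Since pg_xlog is excluded from the copy, the new replica would still
need a restore_command able to fetch from the archive every WAL segment
generated between the two calls.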

Basically, this process is entirely broken unless you're actually taking
a filesystem-level atomic snapshot first (and that has to be atomic
across all tablespaces too).  Perhaps that's what you meant when you
mentioned a snapshot, but if not, then this definitely isn't good.

Note that if you use pg_start/stop_backup, you need to make sure to wait
for the replica to be all the way caught up with where the
'pg_start_backup' was issued on the primary before you start copying
files on the replica.
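
The catch-up check can be sketched as a comparison of WAL positions.
This assumes the pre-10 function name pg_last_xlog_replay_location()
(pg_last_wal_replay_lsn() from PostgreSQL 10 on), that the replica is in
recovery so the function returns a non-NULL value, and that the target
is the position returned by pg_start_backup() on the primary.  lsn_ge
and wait_for_replay are hypothetical helper names:

```shell
#!/bin/sh
# Sketch only: block until a replica has replayed WAL up to a given LSN.

# lsn_ge A B: succeed if LSN A >= LSN B.  An LSN looks like "16/B374D848":
# two hexadecimal halves separated by a slash, high 32 bits first.
lsn_ge() {
    a_hi=$((0x${1%%/*})); a_lo=$((0x${1##*/}))
    b_hi=$((0x${2%%/*})); b_lo=$((0x${2##*/}))
    if [ "$a_hi" -ne "$b_hi" ]; then
        [ "$a_hi" -gt "$b_hi" ]
    else
        [ "$a_lo" -ge "$b_lo" ]
    fi
}

# wait_for_replay HOST LSN: poll the replica once a second until its
# replay position has reached LSN.
wait_for_replay() {
    replica=$1
    target=$2
    until lsn_ge "$(psql -h "$replica" -At \
            -c "SELECT pg_last_xlog_replay_location()")" "$target"; do
        sleep 1
    done
}
```

With these, the copy from the replica would start only after something
like wait_for_replay "$replica" "$start_lsn" has returned, where
start_lsn holds the output of pg_start_backup() on the primary.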

> We have cloned hundreds of replicas with that procedure and never saw
> any issues, also never saw the "replication checkpoint has wrong magic"
> error, so we are wondering what could be the possible reason behind
> that failure? We also saw the disk error on another shard not long
> after the initial copy (but not on those that had the "replication
> checkpoint error"), so hardware issues are on our list as well (but
> then how come both had the same wrong value for the "wrong magic"?)

If you've not seen any other corruption due to this, I'd call you
extremely lucky.  I'd strongly suggest you look at some of the existing
tools for doing backup/recovery of PG and use them to build out
replicas.

Thanks!

Stephen