pg9.6: PITR paused just after failover, need manual resume

magodo <wztdyl@xxxxxxxx> · Wed, 10 Oct 2018 11:24:47 +0800

Hello,

I'm figuring out how to do PITR on a HA setup for postgres 9.6. I
almost manage to do it, while I faced following issue:

[1. create basebackup]

[2. insert 2 rows]

on both primary and standby's pg_xlog

000000010000000000000005

rmgr: Transaction len (rec/tot):     34/    34, tx:        633, lsn:
0/05016120, prev 0/050160E0, desc: COMMIT 2018-10-10 03:04:58.459754
UTC  - insert 1 (before: 2018-10-10 11:05:04 +08:00)

rmgr: Transaction len (rec/tot):     34/    34, tx:        634, lsn:
0/05016188, prev 0/05016148, desc: COMMIT 2018-10-10 03:05:07.010478
UTC  - insert 2 (before: 2018-10-10 11:05:09 +08:00)

[3. failover and promote the standby]

000000010000000000000005 been archived as
000000010000000000000005.partial, and a new wal is switched out named:
000000020000000000000005, which contains above two commits.

[4. recover to "after insert 1st row"]

Stop new standby(the rewinded primary), and on the new primary(promoted
standby), do following:

1. stop DB
2. sync $PGDATA against basebackup
3. create a recovery.conf with following content:

   restore_command = 'cp /var/lib/pgsql/9.6/data/archive/%f %p'
   recovery_target_timeline = 2
   recovery_target_time = '2018-10-10 11:05:04 +08:00'

4. stat DB

Then I expect server is recovered to the time before insert 2nd row but
after insert 1st row. The result is as I expected, but it is in a
"read-only transaction". The log shows following:

< 2018-10-10 03:09:56.790 UTC > LOG:  database system was interrupted;
last known up at 2018-10-10 02:57:59 UTC
< 2018-10-10 03:09:56.931 UTC > DEBUG:  restore_command = 'cp
/var/lib/pgsql/9.6/data/archive/%f %p'
< 2018-10-10 03:09:56.931 UTC > DEBUG:  recovery_target_timeline = 2
< 2018-10-10 03:09:56.931 UTC > DEBUG:  recovery_target_time = '2018-
10-10 03:05:04+00'
< 2018-10-10 03:09:56.934 UTC > LOG:  restored log file
"00000002.history" from archive
< 2018-10-10 03:09:56.934 UTC > LOG:  starting point-in-time recovery
to 2018-10-10 03:05:04+00
< 2018-10-10 03:09:56.936 UTC > LOG:  restored log file
"00000002.history" from archive
cp: cannot stat
'/var/lib/pgsql/9.6/data/archive/000000020000000000000004': No such
file or directory
< 2018-10-10 03:09:56.941 UTC > DEBUG:  could not restore file
"000000020000000000000004" from archive: child process exited with exit
code
1                                                                      

< 2018-10-10 03:09:56.952 UTC > LOG:  restored log file
"000000010000000000000004" from archive
< 2018-10-10 03:09:57.012 UTC > DEBUG:  got WAL segment from archive
< 2018-10-10 03:09:57.013 UTC > DEBUG:  checkpoint record is at
0/4000060
< 2018-10-10 03:09:57.013 UTC > DEBUG:  redo record is at 0/4000028;
shutdown FALSE
< 2018-10-10 03:09:57.013 UTC > DEBUG:  next transaction ID: 0:632;
next OID: 16385
< 2018-10-10 03:09:57.013 UTC > DEBUG:  next MultiXactId: 1; next
MultiXactOffset: 0
< 2018-10-10 03:09:57.013 UTC > DEBUG:  oldest unfrozen transaction ID:
623, in database 1
< 2018-10-10 03:09:57.013 UTC > DEBUG:  oldest MultiXactId: 1, in
database 1
< 2018-10-10 03:09:57.013 UTC > DEBUG:  commit timestamp Xid
oldest/newest: 0/0
< 2018-10-10 03:09:57.013 UTC > DEBUG:  transaction ID wrap limit is
2147484270, limited by database with OID 1
< 2018-10-10 03:09:57.013 UTC > DEBUG:  MultiXactId wrap limit is
2147483648, limited by database with OID 1
< 2018-10-10 03:09:57.013 UTC > DEBUG:  starting up replication slots
< 2018-10-10 03:09:57.013 UTC > DEBUG:  restoring replication slot from
"pg_replslot/repl_slot/state"
< 2018-10-10 03:09:57.015 UTC > DEBUG:  starting up replication origin
progress state
< 2018-10-10 03:09:57.020 UTC > DEBUG:  resetting unlogged relations:
cleanup 1 init 0
< 2018-10-10 03:09:57.021 UTC > DEBUG:  initializing for hot standby
< 2018-10-10 03:09:57.021 UTC > LOG:  redo starts at 0/4000028
< 2018-10-10 03:09:57.021 UTC > DEBUG:  recovery snapshots are now
enabled
< 2018-10-10 03:09:57.021 UTC > CONTEXT:  xlog redo at 0/4000028 for
Standby/RUNNING_XACTS: nextXid 632 latestCompletedXid 631
oldestRunningXid
632                                                                    

< 2018-10-10 03:09:57.021 UTC > DEBUG:  end of backup reached
< 2018-10-10 03:09:57.021 UTC > CONTEXT:  xlog redo at 0/40000D0 for
XLOG/BACKUP_END: 0/4000028
< 2018-10-10 03:09:57.022 UTC > LOG:  consistent recovery state reached
at 0/40000F8
< 2018-10-10 03:09:57.023 UTC > DEBUG:  checkpointer updated shared
memory configuration values
< 2018-10-10 03:09:57.024 UTC > LOG:  database system is ready to
accept read only connections
< 2018-10-10 03:09:57.028 UTC > DEBUG:  archived transaction log file
"000000010000000000000004"
< 2018-10-10 03:09:57.035 UTC > DEBUG:  archived transaction log file
"00000002.history"
< 2018-10-10 03:09:57.056 UTC > LOG:  restored log file
"000000020000000000000005" from archive
< 2018-10-10 03:09:57.116 UTC > DEBUG:  got WAL segment from archive
< 2018-10-10 03:09:57.117 UTC > LOG:  recovery stopping before commit
of transaction 634, time 2018-10-10 03:05:07.010478+00
< 2018-10-10 03:09:57.117 UTC > LOG:  recovery has paused
< 2018-10-10 03:09:57.117 UTC > HINT:  Execute pg_xlog_replay_resume()
to continue.
< 2018-10-10 03:09:57.119 UTC > DEBUG:  archived transaction log file
"000000020000000000000005"
< 2018-10-10 03:09:57.759 UTC > DEBUG:  forked new backend, pid=254
socket=11
< 2018-10-10 03:09:57.771 UTC > DEBUG:  server process (PID 254) exited
with exit code 0

And I have to execute `pg_xlog_replay_resume()` on the new primary so
that it will branch out to new timeline 3 and work normally...

Would anyone please tell me what's going on here? How can I avoid doing
a manual resume?

---
Magodo