Re: pg9.6: PITR paused just after failover, need manual resume

magodo <wztdyl@xxxxxxxx> · Wed, 10 Oct 2018 14:42:48 +0800

On Wed, 2018-10-10 at 11:24 +0800, magodo wrote:
> Hello,
> 
> I'm figuring out how to do PITR on a HA setup for postgres 9.6. I
> almost manage to do it, while I faced following issue:
> 
> 
> [1. create basebackup]
> 
> [2. insert 2 rows]
> 
> on both primary and standby's pg_xlog
> 
> 000000010000000000000005
> 
> rmgr: Transaction len (rec/tot):     34/    34, tx:        633, lsn:
> 0/05016120, prev 0/050160E0, desc: COMMIT 2018-10-10 03:04:58.459754
> UTC  - insert 1 (before: 2018-10-10 11:05:04 +08:00)
> 
> rmgr: Transaction len (rec/tot):     34/    34, tx:        634, lsn:
> 0/05016188, prev 0/05016148, desc: COMMIT 2018-10-10 03:05:07.010478
> UTC  - insert 2 (before: 2018-10-10 11:05:09 +08:00)
> 
> [3. failover and promote the standby]
> 
> 000000010000000000000005 been archived as
> 000000010000000000000005.partial, and a new wal is switched out
> named:
> 000000020000000000000005, which contains above two commits.
> 
> [4. recover to "after insert 1st row"]
> 
> Stop new standby(the rewinded primary), and on the new
> primary(promoted
> standby), do following:
> 
> 1. stop DB
> 2. sync $PGDATA against basebackup
> 3. create a recovery.conf with following content:
> 
>    restore_command = 'cp /var/lib/pgsql/9.6/data/archive/%f %p'
>    recovery_target_timeline = 2
>    recovery_target_time = '2018-10-10 11:05:04 +08:00'
> 
> 4. stat DB
> 
> Then I expect server is recovered to the time before insert 2nd row
> but
> after insert 1st row. The result is as I expected, but it is in a
> "read-only transaction". The log shows following:
> 
> < 2018-10-10 03:09:56.790 UTC > LOG:  database system was
> interrupted;
> last known up at 2018-10-10 02:57:59 UTC
> < 2018-10-10 03:09:56.931 UTC > DEBUG:  restore_command = 'cp
> /var/lib/pgsql/9.6/data/archive/%f %p'
> < 2018-10-10 03:09:56.931 UTC > DEBUG:  recovery_target_timeline = 2
> < 2018-10-10 03:09:56.931 UTC > DEBUG:  recovery_target_time = '2018-
> 10-10 03:05:04+00'
> < 2018-10-10 03:09:56.934 UTC > LOG:  restored log file
> "00000002.history" from archive
> < 2018-10-10 03:09:56.934 UTC > LOG:  starting point-in-time recovery
> to 2018-10-10 03:05:04+00
> < 2018-10-10 03:09:56.936 UTC > LOG:  restored log file
> "00000002.history" from archive
> cp: cannot stat
> '/var/lib/pgsql/9.6/data/archive/000000020000000000000004': No such
> file or directory
> < 2018-10-10 03:09:56.941 UTC > DEBUG:  could not restore file
> "000000020000000000000004" from archive: child process exited with
> exit
> code
> 1                                                                    
>   
>                                                             
> < 2018-10-10 03:09:56.952 UTC > LOG:  restored log file
> "000000010000000000000004" from archive
> < 2018-10-10 03:09:57.012 UTC > DEBUG:  got WAL segment from archive
> < 2018-10-10 03:09:57.013 UTC > DEBUG:  checkpoint record is at
> 0/4000060
> < 2018-10-10 03:09:57.013 UTC > DEBUG:  redo record is at 0/4000028;
> shutdown FALSE
> < 2018-10-10 03:09:57.013 UTC > DEBUG:  next transaction ID: 0:632;
> next OID: 16385
> < 2018-10-10 03:09:57.013 UTC > DEBUG:  next MultiXactId: 1; next
> MultiXactOffset: 0
> < 2018-10-10 03:09:57.013 UTC > DEBUG:  oldest unfrozen transaction
> ID:
> 623, in database 1
> < 2018-10-10 03:09:57.013 UTC > DEBUG:  oldest MultiXactId: 1, in
> database 1
> < 2018-10-10 03:09:57.013 UTC > DEBUG:  commit timestamp Xid
> oldest/newest: 0/0
> < 2018-10-10 03:09:57.013 UTC > DEBUG:  transaction ID wrap limit is
> 2147484270, limited by database with OID 1
> < 2018-10-10 03:09:57.013 UTC > DEBUG:  MultiXactId wrap limit is
> 2147483648, limited by database with OID 1
> < 2018-10-10 03:09:57.013 UTC > DEBUG:  starting up replication slots
> < 2018-10-10 03:09:57.013 UTC > DEBUG:  restoring replication slot
> from
> "pg_replslot/repl_slot/state"
> < 2018-10-10 03:09:57.015 UTC > DEBUG:  starting up replication
> origin
> progress state
> < 2018-10-10 03:09:57.020 UTC > DEBUG:  resetting unlogged relations:
> cleanup 1 init 0
> < 2018-10-10 03:09:57.021 UTC > DEBUG:  initializing for hot standby
> < 2018-10-10 03:09:57.021 UTC > LOG:  redo starts at 0/4000028
> < 2018-10-10 03:09:57.021 UTC > DEBUG:  recovery snapshots are now
> enabled
> < 2018-10-10 03:09:57.021 UTC > CONTEXT:  xlog redo at 0/4000028 for
> Standby/RUNNING_XACTS: nextXid 632 latestCompletedXid 631
> oldestRunningXid
> 632                                                                  
>   
>                                                         
> < 2018-10-10 03:09:57.021 UTC > DEBUG:  end of backup reached
> < 2018-10-10 03:09:57.021 UTC > CONTEXT:  xlog redo at 0/40000D0 for
> XLOG/BACKUP_END: 0/4000028
> < 2018-10-10 03:09:57.022 UTC > LOG:  consistent recovery state
> reached
> at 0/40000F8
> < 2018-10-10 03:09:57.023 UTC > DEBUG:  checkpointer updated shared
> memory configuration values
> < 2018-10-10 03:09:57.024 UTC > LOG:  database system is ready to
> accept read only connections
> < 2018-10-10 03:09:57.028 UTC > DEBUG:  archived transaction log file
> "000000010000000000000004"
> < 2018-10-10 03:09:57.035 UTC > DEBUG:  archived transaction log file
> "00000002.history"
> < 2018-10-10 03:09:57.056 UTC > LOG:  restored log file
> "000000020000000000000005" from archive
> < 2018-10-10 03:09:57.116 UTC > DEBUG:  got WAL segment from archive
> < 2018-10-10 03:09:57.117 UTC > LOG:  recovery stopping before commit
> of transaction 634, time 2018-10-10 03:05:07.010478+00
> < 2018-10-10 03:09:57.117 UTC > LOG:  recovery has paused
> < 2018-10-10 03:09:57.117 UTC > HINT:  Execute
> pg_xlog_replay_resume()
> to continue.
> < 2018-10-10 03:09:57.119 UTC > DEBUG:  archived transaction log file
> "000000020000000000000005"
> < 2018-10-10 03:09:57.759 UTC > DEBUG:  forked new backend, pid=254
> socket=11
> < 2018-10-10 03:09:57.771 UTC > DEBUG:  server process (PID 254)
> exited
> with exit code 0
> 
> And I have to execute `pg_xlog_replay_resume()` on the new primary so
> that it will branch out to new timeline 3 and work normally...
> 
> Would anyone please tell me what's going on here? How can I avoid
> doing
> a manual resume?
> 
> ---
> Magodo

I realized that because the basebackup is made from a hot standby, in
which the `hot_standby = on` is set in *postgresql.conf*. 

>From reference manual:

> ... This parameter can only be set at server start. It only has
effect during archive recovery or in standby mode.

So the primary after PITR has become a hot standby... Then I tried a
`pg_ctl promote` on it, but nothing happened.. Why?

---
Magodo