Hi all,
So we have master(pg11.8) and standbyA(pg11.11), 24T data, quite busy data-warehouse on Amazon Linux.
We've built a standbyB(pg11.11) by creating basebackup from standbyA and collecting wals from the master.
When basebackup completed we started recovery from collected wals.
When standbyB caught-up we tried to switch to streaming.
1. created slot on the master
2. activated slot (with short pg_receivewal invocation)
3. did a checkpoint on the master (not sure why)
4. cancelled pg_receivewal.
5. stopped the standbyB and started it. Startup fails with
FATAL: out-of-order XID insertion in KnownAssignedXids
More log attached.
after switching back to log-shipping same error occurs.
Any clues why/what is the error ?
Thanks and regards,
Radoslav
2021-03-31 14:22:36 UTC [41623] 0: [8-1] user=,db= host=,app=[] LOG: received fast shutdown request 2021-03-31 14:22:36 UTC [41623] 0: [9-1] user=,db= host=,app=[] LOG: aborting any active transactions 2021-03-31 14:22:37 UTC [41642] 0: [1483-1] user=,db= host=,app=[] LOG: restartpoint complete: wrote 753696 buffers (2.9%); 0 WAL file(s) added, 0 removed, 345 recycled; write=122.672 s, sync=0.063 s, total=122.871 s; sync files=2588, longest=0.001 s, average=0.001 s; distance=6276805 kB, estimate=7385309 kB 2021-03-31 14:22:37 UTC [41642] 0: [1484-1] user=,db= host=,app=[] LOG: recovery restart point at A8EA1/298E6BB8 2021-03-31 14:22:37 UTC [41642] 0: [1485-1] user=,db= host=,app=[] DETAIL: Last completed transaction was at log time 2021-03-31 14:22:35.091945+00. 2021-03-31 14:22:38 UTC [41642] 0: [1486-1] user=,db= host=,app=[] LOG: shutting down 2021-03-31 14:22:38 UTC [41642] 0: [1487-1] user=,db= host=,app=[] LOG: restartpoint starting: shutdown immediate 2021-03-31 14:22:41 UTC [41642] 0: [1488-1] user=,db= host=,app=[] LOG: restartpoint complete: wrote 304044 buffers (1.2%); 0 WAL file(s) added, 0 removed, 0 recycled; write=3.318 s, sync=0.027 s, total=3.348 s; sync files=2154, longest=0.001 s, average=0.001 s; distance=4314192 kB, estimate=7078198 kB 2021-03-31 14:22:41 UTC [41642] 0: [1489-1] user=,db= host=,app=[] LOG: recovery restart point at A8EA2/30DFAEA8 2021-03-31 14:22:41 UTC [41642] 0: [1490-1] user=,db= host=,app=[] DETAIL: Last completed transaction was at log time 2021-03-31 14:22:35.091945+00. 2021-03-31 14:22:41 UTC [41623] 0: [10-1] user=,db= host=,app=[] LOG: database system is shut down 2021-03-31 14:23:03 UTC [84999] 0: [1-1] user=,db= host=,app=[] LOG: database system was shut down in recovery at 2021-03-31 14:22:41 UTC 2021-03-31 14:23:03 UTC [84999] 0: [2-1] user=,db= host=,app=[] LOG: entering standby mode 2021-03-31 14:23:03 UTC [84999] 0: [3-1] user=,db= host=,app=[] LOG: recovered replication state of node 1 to 526/409B7D28 2021-03-31 14:23:03 UTC [84999] 0: [4-1] user=,db= host=,app=[] LOG: recovered replication state of node 2 to 8B75/A97FC4E0 2021-03-31 14:23:03 UTC [84999] 0: [5-1] user=,db= host=,app=[] LOG: recovered replication state of node 19 to 580/B02F81A8 2021-03-31 14:23:03 UTC [84999] 0: [6-1] user=,db= host=,app=[] LOG: recovered replication state of node 6 to 580/B03B13C0 2021-03-31 14:23:03 UTC [84999] 0: [7-1] user=,db= host=,app=[] LOG: recovered replication state of node 5 to 228D/B578FBA8 2021-03-31 14:23:03 UTC [84999] 0: [8-1] user=,db= host=,app=[] LOG: recovered replication state of node 3 to 5F4/6B0D7560 2021-03-31 14:23:03 UTC [84999] 0: [9-1] user=,db= host=,app=[] LOG: recovered replication state of node 7 to 65F/9971C210 2021-03-31 14:23:03 UTC [84999] 0: [10-1] user=,db= host=,app=[] LOG: recovered replication state of node 9 to 65F/B19E5C38 2021-03-31 14:23:03 UTC [84999] 0: [11-1] user=,db= host=,app=[] LOG: recovered replication state of node 10 to 3D8/2E753F28 2021-03-31 14:23:03 UTC [84999] 0: [12-1] user=,db= host=,app=[] LOG: recovered replication state of node 12 to 526/40791D18 2021-03-31 14:23:03 UTC [84999] 0: [13-1] user=,db= host=,app=[] LOG: recovered replication state of node 13 to 521/E87DE3B8 2021-03-31 14:23:03 UTC [84999] 0: [14-1] user=,db= host=,app=[] LOG: recovered replication state of node 4 to 526/409B7EE8 2021-03-31 14:23:03 UTC [84999] 0: [15-1] user=,db= host=,app=[] LOG: recovered replication state of node 15 to 65F/B1817750 2021-03-31 14:23:03 UTC [84999] 0: [16-1] user=,db= host=,app=[] LOG: recovered replication state of node 14 to 200/20004088 2021-03-31 14:23:03 UTC [84999] 0: [17-1] user=,db= host=,app=[] LOG: recovered replication state of node 17 to 63F/5CF62898 2021-03-31 14:23:03 UTC [84999] 0: [18-1] user=,db= host=,app=[] LOG: recovered replication state of node 18 to 96D/9D63BD30 2021-03-31 14:23:03 UTC [84999] 0: [19-1] user=,db= host=,app=[] LOG: recovered replication state of node 16 to DA/43CB4728 2021-03-31 14:23:03 UTC [84999] 0: [20-1] user=,db= host=,app=[] LOG: recovered replication state of node 20 to 580/B03AFB58 2021-03-31 14:23:03 UTC [84999] 0: [21-1] user=,db= host=,app=[] LOG: recovered replication state of node 11 to 580/B02A63C8 2021-03-31 14:23:03 UTC [84999] 0: [22-1] user=,db= host=,app=[] LOG: recovered replication state of node 21 to 37C/BC042508 2021-03-31 14:23:03 UTC [84999] 0: [23-1] user=,db= host=,app=[] LOG: recovered replication state of node 22 to 319/ECA0DCF0 2021-03-31 14:23:03 UTC [84999] 0: [24-1] user=,db= host=,app=[] LOG: redo starts at A8EA2/30DFAEA8 2021-03-31 14:23:04 UTC [84999] 0: [25-1] user=,db= host=,app=[] LOG: 11 KnownAssignedXids (num=11 tail=0 head=11) [0]=27311547 [1]=41706387 [2]=41821378 [3]=44370470 [4]=44843675 [5]=51340860 [6]=53056024 [7]=53575629 [8]=53575647 [9]=53575669 [10]=55152872 2021-03-31 14:23:04 UTC [84999] 0: [26-1] user=,db= host=,app=[] CONTEXT: WAL redo at A8EA2/3A900150 for Standby/RUNNING_XACTS: nextXid 55152875 latestCompletedXid 55152874 oldestRunningXid 4260572256; 13 xacts: 53056024 53575669 41821378 53575647 41706387 53575629 4260572256 51340860 54917157 44843675 44370470 27311547 55152872 2021-03-31 14:23:04 UTC [84999] 0: [27-1] user=,db= host=,app=[] FATAL: out-of-order XID insertion in KnownAssignedXids 2021-03-31 14:23:04 UTC [84999] 0: [28-1] user=,db= host=,app=[] CONTEXT: WAL redo at A8EA2/3A900150 for Standby/RUNNING_XACTS: nextXid 55152875 latestCompletedXid 55152874 oldestRunningXid 4260572256; 13 xacts: 53056024 53575669 41821378 53575647 41706387 53575629 4260572256 51340860 54917157 44843675 44370470 27311547 55152872 2021-03-31 14:23:04 UTC [84985] 0: [7-1] user=,db= host=,app=[] LOG: startup process (PID 84999) exited with exit code 1 2021-03-31 14:23:04 UTC [84985] 0: [8-1] user=,db= host=,app=[] LOG: terminating any other active server processes