Re: Replication failure, slave requesting old segments


On 08/11/2018 12:42 PM, Phil Endecott wrote:
Hi Adrian,

Adrian Klaver wrote:
Looks like the master recycled the WALs while the slave could not connect.

Yes but... why is that a problem?  The master is copying the WALs to
the backup server using scp, where they remain forever.  The slave gets

To me it looks like that did not happen:

2018-08-11 00:05:50.364 UTC [615] LOG: restored log file "0000000100000007000000D0" from archive
scp: backup/postgresql/archivedir/0000000100000007000000D1: No such file or directory
2018-08-11 00:05:51.325 UTC [7208] LOG: started streaming WAL from primary at 7/D0000000 on timeline 1
2018-08-11 00:05:51.325 UTC [7208] FATAL: could not receive data from WAL stream: ERROR: requested WAL segment 0000000100000007000000D0 has already been removed

Above, 0000000100000007000000D0 is gone/recycled on the master, and the archived copy does not seem to be complete, since streaming replication is still asking the master for that same segment.
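
To see why the streaming start point 7/D0000000 corresponds to segment file 0000000100000007000000D0, here is a minimal Python sketch of the name-to-LSN mapping. It assumes the default 16 MiB WAL segment size; the function name segment_start_lsn is made up for illustration and is not part of PostgreSQL.

```python
# Assumption: default WAL segment size of 16 MiB (a non-default
# --wal-segsize changes the mapping).
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def segment_start_lsn(name, segment_size=WAL_SEGMENT_SIZE):
    """Return (timeline, start LSN) for a 24-hex-digit WAL segment file name.

    Hypothetical helper for illustration only.
    """
    if len(name) != 24:
        raise ValueError("expected a 24-character WAL segment name")
    timeline = int(name[0:8], 16)   # first 8 hex digits: timeline ID
    log = int(name[8:16], 16)       # next 8: high 32 bits of the LSN
    seg = int(name[16:24], 16)      # last 8: segment number within that log
    offset = seg * segment_size     # low 32 bits of the segment's first byte
    return timeline, f"{log:X}/{offset:X}"


if __name__ == "__main__":
    print(segment_start_lsn("0000000100000007000000D0"))
```

So when the slave says it is streaming from 7/D0000000, it is asking the master for the start of *D0, the same segment it had just restored from the archive.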


Below, you kick the master and it coughs up the files to the archive, including *D0 and *D1 on up to *D4, and then streaming picks up at *D5.

2018-08-11 00:55:49.741 UTC [7954] LOG: restored log file "0000000100000007000000D0" from archive
2018-08-11 00:56:12.304 UTC [7954] LOG: restored log file "0000000100000007000000D1" from archive
2018-08-11 00:56:35.481 UTC [7954] LOG: restored log file "0000000100000007000000D2" from archive
2018-08-11 00:56:57.443 UTC [7954] LOG: restored log file "0000000100000007000000D3" from archive
2018-08-11 00:57:21.723 UTC [7954] LOG: restored log file "0000000100000007000000D4" from archive
scp: backup/postgresql/archivedir/0000000100000007000000D5: No such file or directory
2018-08-11 00:57:22.915 UTC [7954] LOG: unexpected pageaddr 7/C7000000 in log segment 0000000100000007000000D5, offset 0
2018-08-11 00:57:23.114 UTC [12348] LOG: started streaming WAL from primary at 7/D5000000 on timeline 1


Best guess is the archiving did not work as expected during:

"(During this time the master was also down for a shorter period.)"

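One way to test that guess is to look for archived segments that are not full size. The following is a rough Python sketch, not a definitive check: it assumes uncompressed archive files and the default 16 MiB segment size, and the function name find_incomplete_segments and the directory argument are placeholders.

```python
import os
import re

# Plain WAL segment files have 24 upper-case hex digits as their name;
# .history, .backup and .partial files are skipped by this pattern.
SEGMENT_NAME = re.compile(r"^[0-9A-F]{24}$")


def find_incomplete_segments(archive_dir, segment_size=16 * 1024 * 1024):
    """List (name, size) for archived segments whose size is unexpected.

    Hypothetical helper; assumes uncompressed archives and, by default,
    16 MiB segments.
    """
    suspects = []
    for entry in sorted(os.listdir(archive_dir)):
        if not SEGMENT_NAME.match(entry):
            continue
        size = os.path.getsize(os.path.join(archive_dir, entry))
        if size != segment_size:
            suspects.append((entry, size))
    return suspects
```

Running this against the archive directory on the backup server right after an outage would show whether a short copy of a segment such as *D0 was sitting there when the slave tried to restore it.
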
them from there before it starts streaming.  So it shouldn't matter
if the master recycles them, as the slave should be able to get everything
using the combination of scp and then streaming.

Am I missing something about how this sort of replication is supposed to
work?


Thanks, Phil.







--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx



