
Re: Replication failure, slave requesting old segments


 



On 08/12/2018 12:53 PM, Phil Endecott wrote:
Phil Endecott wrote:
On the master, I have:

wal_level = replica
archive_mode = on
archive_command = 'ssh backup test ! -f backup/postgresql/archivedir/%f &&
                    scp %p backup:backup/postgresql/archivedir/%f'

On the slave I have:

standby_mode = 'on'
primary_conninfo = 'user=postgres host=master port=5432'
restore_command = 'scp backup:backup/postgresql/archivedir/%f %p'

hot_standby = on

2018-08-11 00:05:50.364 UTC [615] LOG:  restored log file "0000000100000007000000D0" from archive
scp: backup/postgresql/archivedir/0000000100000007000000D1: No such file or directory
2018-08-11 00:05:51.325 UTC [7208] LOG:  started streaming WAL from primary at 7/D0000000 on timeline 1
2018-08-11 00:05:51.325 UTC [7208] FATAL:  could not receive data from WAL stream: ERROR:  requested WAL segment 0000000100000007000000D0 has already been removed


I am wondering if I need to set wal_keep_segments to at least 1 or 2 for
this to work.  I currently have it unset and I believe the default is 0.

Given that WAL segments are only 16 MB each, I would probably bump it up to be on the safe side, or use replication slots:

https://www.postgresql.org/docs/9.6/static/warm-standby.html

26.2.6. Replication Slots

Note, though, that replication slots do not limit how much WAL is retained, so a long outage could result in WAL segments piling up on the master.
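
For a rough sense of what bumping wal_keep_segments costs in disk space, the arithmetic can be sketched as below. This is only an estimate under the assumption of the default 16 MB segment size; the function name is made up for illustration, and the setting is a minimum, so pg_xlog may hold more than this at any given moment.

```python
# Rough estimate of the minimum WAL kept in pg_xlog for a given
# wal_keep_segments value. Assumes the default 16 MB segment size.
# retained_wal_mb is a hypothetical helper, not a PostgreSQL function.

WAL_SEGMENT_SIZE_MB = 16


def retained_wal_mb(wal_keep_segments, segment_size_mb=WAL_SEGMENT_SIZE_MB):
    """Return the minimum number of megabytes of WAL retained."""
    if wal_keep_segments < 0:
        raise ValueError("wal_keep_segments cannot be negative")
    return wal_keep_segments * segment_size_mb


if __name__ == "__main__":
    for n in (0, 1, 2, 32):
        print("wal_keep_segments = %d keeps at least %d MB" % (n, retained_wal_mb(n)))
```

So a value of 1 or 2 costs very little disk, which is why erring on the high side seems cheap here.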


My understanding was that when using archive_command/restore_command to copy
WAL segments it would not be necessary to use wal_keep_segments to retain
files in pg_xlog on the server; the slave can get everything using a
combination of copying files using the restore_command and streaming.
But these lines from the log:

2018-08-11 00:12:15.797 UTC [7954] LOG: redo starts at 7/D0F956C0
2018-08-11 00:12:16.068 UTC [7954] LOG: consistent recovery state reached at 7/D0FFF088

make me think that there is an issue when the slave reaches the end of the
copied WAL file.  I speculate that the useful content of this WAL segment
ends at FFF088, which is followed by an empty gap due to record sizes.  But
the slave tries to start streaming from this point, D0FFF088, not D1000000.
If the master still had a copy of segment D0 then it would be able to stream
this gap followed by the real content in the current segment D1.

Does that make any sense at all?
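
To make the segment arithmetic in the paragraph above concrete, here is a small sketch that maps a WAL position to the segment file containing it and to the first byte of that segment. It assumes the default 16 MB segment size and the usual 24-hex-digit file naming (timeline, then two 8-digit segment fields); the helper names are made up for illustration and are not PostgreSQL functions.

```python
# Map a WAL position (LSN) such as "7/D0FFF088" to its segment file name
# and to the start of that segment. Assumes 16 MB segments (the default).
# All function names here are hypothetical helpers.

WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def parse_lsn(text):
    """Convert 'X/Y' hex notation into a 64-bit integer position."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def format_lsn(lsn):
    """Convert a 64-bit integer position back into 'X/Y' hex notation."""
    return "%X/%X" % (lsn >> 32, lsn & 0xFFFFFFFF)


def segment_file_name(timeline, lsn, seg_size=WAL_SEGMENT_SIZE):
    """Name of the WAL segment file that contains the given position."""
    segs_per_xlogid = 0x100000000 // seg_size
    segno = lsn // seg_size
    return "%08X%08X%08X" % (timeline, segno // segs_per_xlogid, segno % segs_per_xlogid)


def segment_start(lsn, seg_size=WAL_SEGMENT_SIZE):
    """Position of the first byte of the segment containing lsn."""
    return lsn - (lsn % seg_size)


if __name__ == "__main__":
    lsn = parse_lsn("7/D0FFF088")
    print(segment_file_name(1, lsn))       # segment holding the position
    print(format_lsn(segment_start(lsn)))  # first byte of that segment
```

With the values from the log, 7/D0FFF088 falls inside segment 0000000100000007000000D0, whose first byte is 7/D0000000. That is the position in the "started streaming WAL from primary at 7/D0000000" line, and D0 is the segment the master reports as already removed.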


Regards, Phil.






--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx



