2015-05-25 15:15 GMT+02:00 Piotr Gasidło <quaker@xxxxxxxxxxxxxx>:
--
2015-05-25 11:30 GMT+02:00 Guillaume Lelarge <guillaume@xxxxxxxxxxxx>:
>> I currently have wal_keep_segments set to 0.
>> Setting this to higher value will help? As I understand: master won't
>> delete segment and could stream it to slave on request - so it will
>> help.
>
>
> It definitely helps, but the issue could still happen.
>
What conditions must be met for the issue to happen?
Very high WAL traffic can make the slave lag so far behind that the master has already recycled the segment it needs, even with wal_keep_segments set.
Both archive_command on the master and restore_command on the slave are set and working.
wal_keep_segments is set as well.
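
For illustration only, a setup of this kind might look like the fragment
below. It assumes a 9.x-era installation where the standby still uses
recovery.conf; the archive path, host name, user name and the value 64 are
placeholders, not the actual configuration discussed in this thread.

```
# master: postgresql.conf (placeholder values)
wal_level = hot_standby
archive_mode = on
archive_command = 'cp %p /mnt/wal_archive/%f'   # hypothetical archive path
wal_keep_segments = 64                          # hypothetical value
max_wal_senders = 3

# slave: recovery.conf (placeholder values)
standby_mode = 'on'
primary_conninfo = 'host=master.example.com user=replicator'
restore_command = 'cp /mnt/wal_archive/%f %p'   # same hypothetical path
```
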
I see no point of failure, only a delay in the case of high WAL traffic
on the master:
- the slave starts by restoring WALs from the archive,
- then it connects to the master and notices that, to continue from the
master's latest WAL, it needs the previous one ("the issue"),
- the slave asks the master for the previous WAL and gets it: job done,
streaming replication is established, exit,
- if it is unable to get it (WAL traffic is high, and between restoring
the last WAL from the archive and asking the master for the next one,
more than wal_keep_segments segments were recycled), it goes back to
looking for WALs in the archive.
Do I get it right?
Yes. If archive_command (on the master) and restore_command (on the slave) are set correctly, there's no point of failure. You might still get the "WAL not available" error message, but the slave can synchronize itself from the archived WALs.
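
To make the retry loop concrete, here is a small toy model in Python. It is
only a sketch under simplifying assumptions, not how PostgreSQL implements
it: segments are plain integers, the archive always holds every finished
segment, the master keeps exactly the last wal_keep_segments finished
segments, and the function name simulate_standby and the traffic_per_round
argument are made up for this example.

```python
def simulate_standby(traffic_per_round, wal_keep_segments, start_segment=0):
    """Toy model of a slave alternating between archive restore and streaming.

    traffic_per_round: number of segments the master finishes while the
    slave is busy restoring in each round (hypothetical input).
    Returns (streaming_established, list_of_events).
    """
    master_current = start_segment  # newest segment finished and archived
    events = []
    for produced in traffic_per_round:
        # Step 1: replay everything the archive holds at this moment.
        replayed_to = master_current
        events.append(f"restored up to segment {replayed_to} from archive")

        # Meanwhile the master finishes more segments, archives them and
        # recycles everything older than the last wal_keep_segments ones.
        master_current += produced
        oldest_on_master = master_current - wal_keep_segments + 1

        # Step 2: ask the master for the next segment the slave needs.
        needed = replayed_to + 1
        if needed >= oldest_on_master:
            events.append(f"streaming started at segment {needed}")
            return True, events

        # Step 3: the segment is gone from the master, retry via the archive.
        events.append(
            f"segment {needed} already recycled on master, back to archive"
        )
    return False, events


if __name__ == "__main__":
    ok, events = simulate_standby([5, 3, 1], wal_keep_segments=2)
    for line in events:
        print(line)
    print("streaming established:", ok)
```

In this model the slave only switches to streaming in a round where the
master finishes no more than wal_keep_segments segments; until then it keeps
catching up from the archive, which is the delay (not failure) described
above.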
--