Hi,
I'm looking for tips on how best to maintain a replication slave that operates over a high-latency link between networks (around 230 ms). In a good week, replication keeps up for a number of days; however, when the link is under higher-than-average load, replication may stay active for only minutes before falling behind again.
2018-07-24 18:46:14 GMTLOG: database system is ready to accept read only connections
2018-07-24 18:46:15 GMTLOG: started streaming WAL from primary at 2B/93000000 on timeline 1
2018-07-24 18:59:28 GMTLOG: incomplete startup packet
2018-07-24 19:15:36 GMTLOG: incomplete startup packet
2018-07-24 19:15:36 GMTLOG: incomplete startup packet
2018-07-24 19:15:37 GMTLOG: incomplete startup packet
As you can see above, it lasted about half an hour before falling out of sync.
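To put a number on "falling behind", I've been thinking of comparing WAL positions in bytes. Below is a rough sketch of that, assuming the usual LSN text format of two hexadecimal halves (high 32 bits / low 32 bits). The function names are my own, and the primary position in the example is made up purely for illustration; only 2B/93000000 comes from the log above.

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert an LSN such as '2B/93000000' to an absolute byte position."""
    high, low = lsn.split("/")
    # The part before the slash is the high 32 bits, the part after is the low 32 bits.
    return (int(high, 16) << 32) | int(low, 16)


def lag_bytes(primary_lsn: str, standby_lsn: str) -> int:
    """Return how many bytes of WAL the standby is behind the primary."""
    return lsn_to_bytes(primary_lsn) - lsn_to_bytes(standby_lsn)


if __name__ == "__main__":
    standby = "2B/93000000"   # streaming start position from the log above
    primary = "2B/94000000"   # hypothetical primary position, for illustration only
    print(lag_bytes(primary, standby), "bytes behind")
```

The two positions would have to be sampled on the master and the slave at roughly the same moment for the difference to mean anything.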
On the master, I have wal_keep_segments=128. What is happening when I see "incomplete startup packet"? Is it simply that the slave has fallen behind and cannot catch up from the WAL segments quickly enough? I assume the slave uses the WAL segments to replay changes and that, provided there are enough WAL segments to cover the period during which it cannot stream properly, it will eventually recover. Is that right?
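For context, here is my back-of-the-envelope estimate of how long an interruption wal_keep_segments=128 could cover. It is only a sketch: it assumes the default 16 MB WAL segment size, and the WAL generation rate is a hypothetical figure I picked for the example, not something I have measured.

```python
WAL_SEGMENT_BYTES = 16 * 1024 * 1024  # assumption: default 16 MB segment size


def retained_wal_bytes(wal_keep_segments: int) -> int:
    """Minimum amount of WAL the master keeps around for standbys."""
    return wal_keep_segments * WAL_SEGMENT_BYTES


def coverage_seconds(wal_keep_segments: int, wal_bytes_per_second: float) -> float:
    """How long the slave can stall before the WAL it needs is recycled."""
    return retained_wal_bytes(wal_keep_segments) / wal_bytes_per_second


if __name__ == "__main__":
    hypothetical_rate = 1024 * 1024  # 1 MiB/s of WAL, illustrative only
    print(retained_wal_bytes(128) / 1024 ** 3, "GiB retained")
    print(coverage_seconds(128, hypothetical_rate) / 60, "minutes of cover")
```

If the real write rate on the master is much higher than that, the window shrinks proportionally, which might explain why it only lasts minutes on a busy day.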