Warm standby can't start because logs stream too quickly from the master

Zach Walton <zacwalt@xxxxxxxxx> · Sat, 2 Dec 2017 13:02:00 -0600

Looking at the startup process:

postgres 16749  4.1  6.7 17855104 8914544 ?    Ss   18:36   0:44 postgres: startup process   recovering 0000000800005B1C00000030

Then a few seconds later:

postgres 16749  4.2  7.0 17855104 9294172 ?    Ss   18:36   0:47 postgres: startup process   recovering 0000000800005B1C00000047

It's replaying WAL segments from the master, but replay always stays a few segments behind the newest one received, so startup never finishes. Here's a demonstration:

# while :; do echo $(ls data/pg_xlog/ | grep -n $(ps aux | egrep "startup process" | awk '{print $15}')) $(ls data/pg_xlog/ | wc -l); sleep 1; done
# listing position:segment being replayed   # number of entries in pg_xlog
1655:0000000800005B1C00000064 1659
1656:0000000800005B1C00000065 1660
1658:0000000800005B1C00000067 1661
1659:0000000800005B1C00000068 1662
1660:0000000800005B1C00000069 1663
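
Subtracting the two numbers on each line turns that output into an explicit lag figure. This is a minimal sketch, assuming input lines shaped like the output above. `replay_lag` is a hypothetical helper name, not something shipped with PostgreSQL or pgpool. The result is only approximate, because the pg_xlog listing can include entries that are not WAL segments (for example the archive_status directory).

```shell
#!/bin/sh
# Hypothetical helper (not part of PostgreSQL or pgpool): reads lines shaped
# like "<position>:<segment> <total>" on stdin and prints, for each line, the
# segment name and how many directory entries are listed after it.
replay_lag() {
    awk '{
        split($1, pos, ":")          # pos[1] = position in listing, pos[2] = segment name
        print pos[2], $2 - pos[1]    # entries listed after the segment being replayed
    }'
}

# Example using two of the sample lines from the demonstration.
printf '%s\n' \
    '1655:0000000800005B1C00000064 1659' \
    '1660:0000000800005B1C00000069 1663' | replay_lag
```

For those two sample lines it reports a lag of 4 and 3 segments respectively, which matches the "always a few behind" pattern.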

Generally this resolves itself if I wait, though sometimes that takes a very long time. Is there a configuration option that lets a warm standby finish starting without having fully replayed the WAL from the master?

* Note: wal_keep_segments is set to 8192 on these servers, which have large disks, to allow for recovery within a couple of hours of a failover without resorting to restoring from archive.
* This is specifically an issue for pgpool recovery, which fails if a standby can't start within (by default) 300 seconds. I'm open to raising that timeout if there's no way around this.