Postgres 9.6.6. Primary has a local (HA) replica and a remote (DR) replica.
I'd like to know whether there are any networking settings, on the host or on the network, that we may be missing. My manager says that the circuit between data centers was only 60% utilized at its peak.
In the past I've tried increasing wal_keep_segments, which keeps the WAL files available for streaming, but the fact remains that they stream very slowly, so the lag ends up worse than if we had fallen back to the archives every 30 minutes or so.
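To put a number on the lag, here is a minimal sketch of how the gap between two WAL locations can be turned into bytes. It assumes the locations are read from pg_stat_replication on the primary (sent_location and replay_location in 9.6); the function names are my own and purely illustrative. On the server side, pg_xlog_location_diff() does the same arithmetic.

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert a textual WAL location such as '16/B374D848' to a byte offset.

    The part before the slash is the high 32 bits and the part after it
    is the low 32 bits, both written in hexadecimal.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) + int(low, 16)


def lag_bytes(primary_lsn: str, replica_lsn: str) -> int:
    """Return how many bytes of WAL the replica is behind the primary."""
    return lsn_to_bytes(primary_lsn) - lsn_to_bytes(replica_lsn)


if __name__ == "__main__":
    # Hypothetical locations, not taken from my environment.
    print(lag_bytes("1/0", "0/FF000000"), "bytes behind")
```

Sampling this every minute or so for the DR replica would show whether the gap grows steadily or only during bursts of write activity.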
I have no basis for this other than my previous experience with Oracle physical standbys, but I would expect streaming replication to be able to push more than it does in my prod environment. The fact that the local replica keeps up just fine without breaking streaming replication tells me that the problem is in the cross-datacenter circuit, not in Postgres recovery performance.
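One thing I can check on paper: streaming replication runs over a single TCP connection, and a single connection can never move more than one window of data per round trip, no matter how much spare capacity the circuit has. Below is a back-of-the-envelope sketch of that limit. The window size, round-trip time and link speed are made-up placeholders, since I have not measured ours yet, and the function names are only for illustration.

```python
def max_single_stream_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound on one TCP stream's throughput: window / round-trip time."""
    bits_per_round_trip = window_bytes * 8
    return bits_per_round_trip / (rtt_ms / 1000) / 1_000_000


def window_needed_bytes(link_mbps: float, rtt_ms: float) -> float:
    """Bandwidth-delay product: the window needed to keep the link full."""
    bytes_per_second = link_mbps * 1_000_000 / 8
    return bytes_per_second * (rtt_ms / 1000)


if __name__ == "__main__":
    # Hypothetical values: 64 KiB window, 40 ms RTT, 1000 Mbit/s circuit.
    print(f"ceiling: {max_single_stream_mbps(64 * 1024, 40):.1f} Mbit/s")
    print(f"window to fill link: {window_needed_bytes(1000, 40):.0f} bytes")
```

If the real numbers look anything like this, a lagging stream on a circuit that is only partly utilized would not be a contradiction, and the host's TCP buffer limits would be the first thing to look at.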
If anyone has any advice on host networking setup, tuning or testing, I'd love to hear it.
Thanks,
Don.