Hello,
I have a PostgreSQL master node that receives a lot of writes; at times
WAL is written at 100 MB/s or more.
When these load spikes happen, streaming replication starts lagging.
The lag appears to build up at the sending stage and to be limited by
the throughput of the master's pg_wal partition.
The partition is on an SSD RAID-1, but the behavior was the same when it
was on an HDD RAID-1.
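
For anyone who wants to put a number on the sending-stage lag: it is the
distance between the master's current WAL position and the sent_lsn that
pg_stat_replication reports for the standby. Below is a minimal Python
sketch of that arithmetic, assuming PostgreSQL 10 or later and the usual
textual LSN format (two hexadecimal numbers separated by a slash). The
function names and sample values are hypothetical, not taken from my
system.

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert a textual LSN such as '16/B374D848' to a byte position.

    The part before the slash is the high 32 bits and the part after it
    is the low 32 bits, both written in hexadecimal.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def send_lag_bytes(current_lsn: str, sent_lsn: str) -> int:
    """Bytes of WAL generated on the master but not yet sent."""
    return lsn_to_bytes(current_lsn) - lsn_to_bytes(sent_lsn)


if __name__ == "__main__":
    # Illustrative values only (assumption): master position vs. sent_lsn.
    print(send_lag_bytes("2/10", "1/FFFFFFF0"))
```

A value that keeps growing during a spike while the standby's write and
replay positions stay close to sent_lsn would point at the sender side.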
I tried to prioritize reads using the deadline I/O scheduler, but even
extreme values don't change the situation: iostat still shows more bytes
written than read, while the device is often 90-100% busy.
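
In case it helps to double-check that the scheduler settings really took
effect, here is a small Python sketch that reads them back from sysfs.
It assumes the standard Linux layout, /sys/block/<device>/queue/scheduler
and /sys/block/<device>/queue/iosched/; the device name "sdX" is a
placeholder and the helper names are hypothetical.

```python
import re
from pathlib import Path


def active_scheduler(scheduler_line: str) -> str:
    """Return the scheduler shown in square brackets.

    The kernel lists all available schedulers and marks the active one,
    for example: 'noop [deadline] cfq'.
    """
    match = re.search(r"\[([^\]]+)\]", scheduler_line)
    if match is None:
        raise ValueError("no active scheduler marked in: %r" % scheduler_line)
    return match.group(1)


def read_tunables(iosched_dir: str) -> dict:
    """Read every tunable file in an iosched directory into a dict."""
    tunables = {}
    for entry in sorted(Path(iosched_dir).iterdir()):
        if entry.is_file():
            tunables[entry.name] = entry.read_text().strip()
    return tunables


if __name__ == "__main__":
    queue = Path("/sys/block/sdX/queue")  # placeholder device (assumption)
    if queue.exists():
        print(active_scheduler((queue / "scheduler").read_text()))
        print(read_tunables(str(queue / "iosched")))
```

With the deadline scheduler the tunables of interest would be
read_expire, write_expire and writes_starved.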
Writers connect via pgbouncer in transaction pooling mode, so I can tune
the number of parallel connections.
Replication keeps up when I limit them to 3, which is quite low.
That works for now, but it looks really inflexible: for example, I'll
have to allocate a separate pgbouncer server pool for less eager apps
and for humans.
I increased wal_buffers to 256MB hoping it would reduce the disk load,
but it probably helps only while the lag is first accumulating; once the
lag is there, it isn't going to help.
ionice'ing walsender to best-effort -2 didn't help either.
Any ideas on how to prioritize walsender reads over writes from the WAL
writer and backends, even when several of them are quite active?
The problem happens only occasionally, so if you ask for more details it
may take me some time to reply.
Sorry about this.
Best, Alex