How to prioritise walsender reading from pg_wal over WAL writes?

Alexey Bashtanov <bashtanov@xxxxxxx> · Fri, 13 Nov 2020 10:12:17 +0000

Hello,

I have a PostgreSQL master node that receives a lot of writes; at times
WAL is written at 100 MB/s or more.
When these load spikes happen, streaming replication starts lagging.
The lag appears to build up at the sending stage and to be limited by
the throughput of the master's pg_wal partition.
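
For anyone wanting to check the same thing: one way to see where the lag sits is to compare pg_current_wal_lsn() on the master with sent_lsn and replay_lsn from pg_stat_replication (names as in PostgreSQL 10 and later). Below is a minimal Python sketch of the arithmetic only. The helper names and the sample LSNs are made up for illustration, and the query is shown as a plain string rather than executed.

```python
# Query to run on the master (column names as in PostgreSQL 10+).
LAG_QUERY = """
SELECT application_name,
       pg_current_wal_lsn() AS current_lsn,
       sent_lsn,
       replay_lsn
FROM pg_stat_replication
"""


def lsn_to_int(lsn: str) -> int:
    """Convert an LSN in 'X/Y' hexadecimal notation to a byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) + int(low, 16)


def lag_breakdown(current_lsn: str, sent_lsn: str, replay_lsn: str) -> dict:
    """Split the lag into WAL not yet sent and WAL sent but not yet replayed."""
    current, sent, replay = (lsn_to_int(x) for x in (current_lsn, sent_lsn, replay_lsn))
    return {
        "send_lag_bytes": current - sent,     # still waiting on the master
        "replay_lag_bytes": sent - replay,    # sent, but not yet applied on the replica
    }


# Hypothetical sample values, not measurements from a real system.
print(lag_breakdown("2A/40000000", "2A/10000000", "2A/0F000000"))
```

A send lag that dominates the total during a spike points at the master side (reading pg_wal or the network) rather than at replay on the replica.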
The partition is an SSD RAID-1, but it was the same when it was an HDD RAID-1.
I tried to prioritize reads using the deadline I/O scheduler, but even
extreme values don't change the situation:
iostat shows more bytes written than read while the device is often
90-100% busy.
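
The same read/write split that iostat reports can be derived from two samples of /proc/diskstats, which may be handy for logging it during a spike. A rough Python sketch follows, assuming the standard Linux field layout (512-byte sectors, milliseconds spent doing I/O in the thirteenth column). The device name and the counter values are invented for illustration.

```python
SECTOR_BYTES = 512  # /proc/diskstats counts sectors in 512-byte units


def parse_diskstats_line(line: str) -> dict:
    """Extract cumulative counters from one line of /proc/diskstats."""
    fields = line.split()
    return {
        "device": fields[2],
        "read_bytes": int(fields[5]) * SECTOR_BYTES,     # sectors read
        "written_bytes": int(fields[9]) * SECTOR_BYTES,  # sectors written
        "io_ms": int(fields[12]),                        # time spent doing I/O
    }


def rates(before: dict, after: dict, interval_s: float) -> dict:
    """Turn two cumulative samples into MB/s and a utilisation percentage."""
    return {
        "read_mb_s": (after["read_bytes"] - before["read_bytes"]) / interval_s / 1e6,
        "write_mb_s": (after["written_bytes"] - before["written_bytes"]) / interval_s / 1e6,
        "util_pct": (after["io_ms"] - before["io_ms"]) / (interval_s * 1000) * 100,
    }


# Hypothetical samples taken one second apart for a made-up device "sdb".
sample_before = "8 16 sdb 1000 0 200000 500 5000 0 2000000 9000 0 4000 9500"
sample_after = "8 16 sdb 1100 0 300000 600 6000 0 2400000 9900 3 4950 10500"

stats = rates(parse_diskstats_line(sample_before),
              parse_diskstats_line(sample_after),
              interval_s=1.0)
print(stats)
```

In a real run the two lines would be read from /proc/diskstats for the device backing pg_wal, with a sleep between the reads.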

Writers connect via a transaction-pooling-mode pgbouncer, so I can tune 
the number of parallel connections.
Replication works fine when I limit it to 3, which is quite low.
This works for now, but it looks really inflexible:
e.g. I'll have to allocate a separate pgbouncer server pool for less
eager apps and for humans.

I increased wal_buffers to 256MB hoping that it would reduce the disk load,
but it probably helps only while the lag is first accumulating;
once the lag is there, it's not going to help.

Ionice'ing the walsender to best-effort -2 didn't help either.

Any ideas on how to prioritize walsender reads over writes from the WAL
writer and backends, even when there are several quite active ones?

The problem happens only occasionally, so if you ask for more details, it
may take me some time to reply.
Sorry about this.

Best, Alex