I would first determine where the bottleneck is.
Is it really the walsender, or is it on the network or in the standby server's replay?
It really is the walsender, and more specifically the performance of the WAL
storage on the master.
Compare "sent_lsn" and "replay_lsn" from "pg_stat_replication" with
pg_current_wal_lsn() on the primary.
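
As a rough sketch of that comparison: an LSN is printed as two hexadecimal numbers separated by a slash, so the gaps can be worked out in bytes once the three values have been read on the primary. The helper names below (lsn_to_int, replication_gaps) and the sample LSNs are made up for illustration. On the server itself, pg_wal_lsn_diff() should give the same differences directly.

```python
# Hypothetical helpers: turn the textual LSNs reported by
# pg_current_wal_lsn() and pg_stat_replication into byte gaps.

def lsn_to_int(lsn: str) -> int:
    """Convert an LSN such as '16/B374D848' to an absolute byte position."""
    high, low = lsn.split("/")
    # The part before the slash is the upper 32 bits, the part after the lower 32.
    return (int(high, 16) << 32) | int(low, 16)


def replication_gaps(current_lsn: str, sent_lsn: str, replay_lsn: str) -> dict:
    """Split the total lag into 'not yet sent' and 'sent but not yet replayed'."""
    current = lsn_to_int(current_lsn)
    sent = lsn_to_int(sent_lsn)
    replay = lsn_to_int(replay_lsn)
    return {
        # Large value here: the primary is not sending WAL fast enough.
        "unsent_bytes": current - sent,
        # Large value here: the network or the standby's replay is behind.
        "unreplayed_bytes": sent - replay,
    }


if __name__ == "__main__":
    # Sample values only, not taken from a real system.
    print(replication_gaps("1/0", "0/FFFFFF00", "0/FFFFFE00"))
```

If unsent_bytes keeps growing while unreplayed_bytes stays small, the delay is on the sending side rather than on the network or in replay on the standby.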
Yes, I've checked these numbers; the lagging one is sent_lsn.
It doesn't look like it's hitting network capacity either.
When we moved the WAL to an NVMe drive as a short-term solution, it worked fine.
Best, Alex