----- On 14 Feb, 2021, at 16:59, Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:

>>>> I don't have a performance system to measure the improvement
>>>> accurately.
>>>
>>> Then let's have Daire try it out, if possible.
>>
>> I'm happy to test it out on one of our 2 x 40G NFS servers with 100 x 1G clients
>> (but it's trickier to patch the clients too atm).
>
> Yes, that's exactly what we need. Thank you!
>
>> Just so I'm clear, this is in addition to Chuck's "Handle TCP socket sends with
>> kernel_sendpage() again" patch from bz #209439 (which I think is now in 5.11
>> rc)? Or do you want to see what this patch looks like on its own without that
>> (e.g. v5.10)?
>
> Please include the "Handle TCP socket sends with kernel_sendpage() again" fix.
> Or, you can pull a recent stable kernel, I think that fix is already in there.

I took v5.10.16 and used a ~100Gbit capable server with ~150 x 1Gbit clients,
all reading the same file from the server's pagecache, as the test.

Both with and without the patch, I consistently see around 90Gbit of sends
from the server for sustained periods. Any differences between the two are
well within the margin of error for repeat runs of the benchmark.

The only noticeable difference is in the output of perf top, where
svc_xprt_do_enqueue goes from ~0.9% without the patch to ~3% with the patch.
It now takes second place (up from 17th place) behind
native_queued_spin_lock_slowpath:

    3.57%  [kernel]  [k] native_queued_spin_lock_slowpath
    3.07%  [kernel]  [k] svc_xprt_do_enqueue

I also don't see much difference in the softirq CPU usage.

So there don't seem to be any negative impacts from the patch, but because I'm
already pushing the server to its network hardware limit (without the patch),
it's also not clear whether it improves performance for this benchmark either.

I also tried with 50 clients and sustained the expected 50Gbit of sends from
the server, both with and without the patch.

Daire
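
P.S. For reference, the client-side load is essentially many parallel
instances of a sequential read loop over the same exported file. A minimal
sketch of that loop is below; the mount path and buffer size are illustrative,
not the exact harness I used.

    /*
     * Minimal sketch of the per-client read load: each client repeatedly
     * does a sequential read of the same exported file, so the server
     * serves it entirely from its pagecache.
     */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
            static char buf[1 << 20];       /* 1 MiB read buffer */

            for (;;) {                      /* run until interrupted */
                    int fd = open("/mnt/nfs/testfile", O_RDONLY);
                    ssize_t n;

                    if (fd < 0) {
                            perror("open");
                            return 1;
                    }
                    while ((n = read(fd, buf, sizeof(buf))) > 0)
                            ;               /* discard data; we only want read load */
                    if (n < 0)
                            perror("read");
                    close(fd);
            }
    }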