> On Jun 18, 2024, at 3:29 PM, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
>
> On Tue, 2024-06-18 at 18:40 +0000, Chuck Lever III wrote:
>>
>>
>>> On Jun 18, 2024, at 2:32 PM, Trond Myklebust
>>> <trondmy@xxxxxxxxxxxxxxx> wrote:
>>>
>>> I recently backported Neil's lwq code and sunrpc server changes to
>>> our 5.15.130-based kernel in the hope of improving the performance
>>> of our data servers.
>>>
>>> Our performance team recently ran a fio workload on a client that
>>> was doing 100% NFSv3 reads in O_DIRECT mode over an RDMA connection
>>> (InfiniBand) against the resulting server. I've attached the
>>> resulting flame graph from a perf profile run on the server side.
>>>
>>> Is anyone else seeing this massive contention for the spin lock in
>>> __lwq_dequeue? As you can see, it appears to be dwarfing all the
>>> other nfsd activity on the system in question here, being
>>> responsible for 45% of all the perf hits.
>>
>> I haven't seen that, but I've been working on other issues.
>>
>> What's the nfsd thread count on your test server? Have you
>> seen a similar impact on 6.10 kernels?
>>
>
> 640 knfsd threads. The machine was a Supermicro 2029BT-HNR with
> 2x Intel 6150, 384GB of memory, and 6x WDC SN840.
>
> Unfortunately, the machine was a loaner, so I cannot compare to 6.10.
> That's why I was asking if anyone has seen anything similar.

If this system had more than one NUMA node, then using svc's
"numa pool" mode might have helped.

--
Chuck Lever
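
P.S. For context on where that spin lock sits: lwq keeps the enqueue
side lock-free via an llist, but every dequeue takes a per-queue
spinlock while it reverses the LIFO llist into FIFO order, so with 640
threads draining one global queue the dequeue lock becomes the hot
spot. A rough sketch of the pattern follows (simplified and with my
own names, not the exact upstream lib/lwq.c):

	#include <linux/llist.h>
	#include <linux/spinlock.h>

	struct lwq_sketch {
		spinlock_t		lock;	/* serializes all dequeuers */
		struct llist_node	*ready;	/* FIFO-ordered, consumed first */
		struct llist_head	new;	/* lock-free LIFO enqueue side */
	};

	/* Enqueue is lock-free: a single cmpxchg-based llist_add(). */
	static void lwq_sketch_enqueue(struct lwq_sketch *q,
				       struct llist_node *n)
	{
		llist_add(n, &q->new);
	}

	/*
	 * Dequeue is where the threads serialize: the spinlock is held
	 * while the LIFO "new" list is reversed into FIFO order and
	 * while the head entry is popped.
	 */
	static struct llist_node *lwq_sketch_dequeue(struct lwq_sketch *q)
	{
		struct llist_node *n;

		spin_lock(&q->lock);
		if (!q->ready)
			q->ready = llist_reverse_order(llist_del_all(&q->new));
		n = q->ready;
		if (n)
			q->ready = n->next;
		spin_unlock(&q->lock);
		return n;
	}

The "numa pool" mode splits that single global queue into one pool per
NUMA node, so each node's threads contend only on their own lock. It
is selected with the sunrpc pool_mode module parameter (for example,
sunrpc.pool_mode=pernode on the kernel command line, or written to
/sys/module/sunrpc/parameters/pool_mode before nfsd starts).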