Hey,

On Fri, Jun 23, 2023 at 02:37:17PM +0000, Chuck Lever III wrote:
> I'm using NFS/RDMA for my test because I can drive more IOPS with it.
>
> I've found that setting the nfsiod and rpciod workqueues to "cpu"
> scope provide the best benefit for this workload. Changing the
> xprtiod workqueue to "cpu" had no discernible effect.
>
> This tracks with the number of queue_work calls for each of these
> WQs. 59% of queue_work calls during the test are for the rpciod
> WQ, 21% are for nfsiod, and 2% is for xprtiod.
>
> The same test with TCP (using IP-over-IB on the same physical network)
> shows no improvement on any test. That suggests there is a bottleneck
> somewhere else, when using TCP, that limits its throughput.

Yeah, you can make the necessary workqueues default to CPU or SMT scope
using apply_workqueue_attrs(). The interface is a bit cumbersome and we
probably wanna add convenience helpers to switch e.g. affinity scopes, but
it's still just several lines of code.

Thanks.

-- 
tejun