On Fri, 2024-11-08 at 15:20 -0800, Dai Ngo wrote:
> Hi Trond,
>
> Currently cl_tasks is used to maintain the list of all rpc_tasks
> for each rpc_clnt.
>
> Under heavy write load, we've seen this list grow to millions
> of entries. Even though the list is extremely long, the system
> still runs fine until the user wants to get the information of
> all active RPC tasks by doing:
>
> # cat /sys/kernel/debug/sunrpc/rpc_clnt/N/tasks
>
> When this happens, tasks_start() is called and it acquires the
> rpc_clnt.cl_lock to walk the cl_tasks list, returning one entry
> at a time to the caller. The cl_lock is held until all tasks on
> this list have been processed.
>
> While the cl_lock is held, completed RPC tasks have to spin-wait
> in rpc_task_release_client for the cl_lock. If there are millions
> of entries in the cl_tasks list, it will take a long time before
> tasks_stop() is called and the cl_lock is released.
>
> Under heavy load conditions, the rpc_task_release_client threads
> will use up all the available CPUs in the system, preventing other
> jobs from running, and this causes the system to temporarily lock up.
>
> I'm looking for suggestions on how to address this issue. I think
> one option is to convert the cl_tasks list to use an xarray, which
> would eliminate the contention on the cl_lock, and I would like to
> get the community's opinion on that.

No. We are definitely not going to add a gravity-challenged solution
like xarray to solve a corner-case problem of list iteration.

Firstly, this is really only a problem for NFSv3 and NFSv4.0, because
they don't actually throttle at the NFS layer.

Secondly, having millions of entries associated with a single struct
rpc_clnt means living in latency hell, where waking up a sleeping task
can mean sitting on the rpciod queue for several hundred ms before
execution starts, due to the sheer volume of tasks in the queue.

So IMHO a better question would be: "What is a sensible throttling
scheme for NFSv3 and NFSv4.0?"

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx
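
For illustration only, here is a minimal user-space sketch of the
contention pattern described in the report above: one walker holds a
single lock across an entire very long list, the way the debugfs tasks
file holds rpc_clnt.cl_lock between tasks_start() and tasks_stop(),
while completion paths spin on the same lock just to unlink one entry
each, the way rpc_task_release_client does. Everything below
(fake_task, reader, completer, the pthread spinlock) is an assumed
stand-in for the kernel structures, not actual sunrpc code.

/*
 * User-space model of the contention: a "reader" holds one lock while
 * walking a very long list, while several "completers" need the same
 * lock just to unlink a single entry.  Illustrative names only.
 */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define NTASKS   1000000
#define NWORKERS 8

struct fake_task {
	struct fake_task *next;
	int id;
};

static pthread_spinlock_t cl_lock;	/* stands in for rpc_clnt.cl_lock */
static struct fake_task *cl_tasks;	/* stands in for the cl_tasks list */

/* Completion path: take the lock only long enough to unlink one entry. */
static void *completer(void *arg)
{
	(void)arg;
	for (;;) {
		pthread_spin_lock(&cl_lock);
		struct fake_task *t = cl_tasks;
		if (t)
			cl_tasks = t->next;
		pthread_spin_unlock(&cl_lock);
		if (!t)
			break;
		free(t);
	}
	return NULL;
}

/* Reader path: hold the lock across the entire walk, as the debugfs
 * tasks file does between tasks_start() and tasks_stop(). */
static void *reader(void *arg)
{
	(void)arg;
	unsigned long n = 0;

	pthread_spin_lock(&cl_lock);
	for (struct fake_task *t = cl_tasks; t; t = t->next)
		n++;
	pthread_spin_unlock(&cl_lock);
	printf("reader walked %lu entries while holding the lock\n", n);
	return NULL;
}

int main(void)
{
	pthread_t workers[NWORKERS], rd;

	pthread_spin_init(&cl_lock, PTHREAD_PROCESS_PRIVATE);

	/* Build a long cl_tasks-style list up front. */
	for (int i = 0; i < NTASKS; i++) {
		struct fake_task *t = malloc(sizeof(*t));
		t->id = i;
		t->next = cl_tasks;
		cl_tasks = t;
	}

	pthread_create(&rd, NULL, reader, NULL);
	for (int i = 0; i < NWORKERS; i++)
		pthread_create(&workers[i], NULL, completer, NULL);

	pthread_join(rd, NULL);
	for (int i = 0; i < NWORKERS; i++)
		pthread_join(workers[i], NULL);
	return 0;
}

Build with something like "cc -O2 -pthread contention.c". While the
reader holds the lock for the full million-entry walk, the completer
threads spin and burn CPU for the whole duration, which is the
behaviour the report above describes on the real cl_lock.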