On Thu, 2020-11-05 at 11:12 +1100, NeilBrown wrote:
> 
> Hi,
> I have a customer report of NFS getting stuck due to a workqueue
> lockup.
> 
> This appears to be triggered by calling 'close' on a 5TB file.
> The rpc_release set up by nfs4_do_close() calls a final iput()
> on the inode, which leads to nfs4_evict_inode(), which calls
> truncate_inode_pages_final().  On a 5TB file, this can take a little
> while.
> 
> Documentation for workqueue says
>     Generally, work items are not expected to hog a CPU and consume
>     many cycles.
> 
> so maybe that isn't a good idea.
> truncate_inode_pages_final() does call cond_resched(), but the
> workqueue doesn't take notice of that.  By default it only runs more
> threads on the same CPU if the first thread actually sleeps.  So the
> net result is that there are lots of rpc_async_schedule tasks queued
> up behind the iput, waiting for it to finish rather than running
> concurrently.
> 
> I believe this can be fixed by setting WQ_CPU_INTENSIVE on the
> nfsiod workqueue.  This flag causes the workqueue core to schedule
> more threads as needed even if the existing threads never sleep.
> I don't know if this is a good idea, as it might spawn lots of
> threads needlessly when rpc_release functions don't have much work
> to do.
> 
> Another option might be to create a separate
> nfsiod_intensive_workqueue with this flag set, and hand all iput
> requests over to this workqueue.
> 
> I've asked the customer to test with this simple patch.
> 
> Any thoughts or suggestions most welcome,

Isn't this a general problem (i.e. one that is not specific to NFS) when
you have multi-terabyte caches? Why wouldn't io_uring be vulnerable to
the same issue, for instance?

The thing is that truncate_inode_pages() has plenty of latency-reducing
cond_resched() calls that should ensure that other threads get
scheduled, so this problem doesn't strictly meet the 'CPU intensive'
criterion that I understand WQ_CPU_INTENSIVE to be designed for.
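[For readers following along: the "separate intensive workqueue" idea discussed above would look roughly like the kernel-C sketch below. The function and workqueue names are hypothetical, not from any posted patch; this is just an illustration of alloc_workqueue() with WQ_CPU_INTENSIVE set, which exempts the queue's work items from the per-CPU concurrency management that otherwise serializes runnable workers.]

```c
/* Hypothetical sketch of a dedicated CPU-intensive workqueue for
 * long-running rpc_release work such as a final iput() that ends up
 * in truncate_inode_pages_final().  Names are illustrative only. */
#include <linux/workqueue.h>

static struct workqueue_struct *nfsiod_intensive_workqueue;

static int nfsiod_intensive_start(void)
{
	/* WQ_CPU_INTENSIVE marks work items as not participating in
	 * concurrency management, so a worker that burns CPU without
	 * sleeping does not prevent other queued items from running.
	 * WQ_MEM_RECLAIM matches the existing nfsiod workqueue, since
	 * this work can sit in the memory-reclaim path. */
	nfsiod_intensive_workqueue =
		alloc_workqueue("nfsiod-intensive",
				WQ_MEM_RECLAIM | WQ_CPU_INTENSIVE, 0);
	if (!nfsiod_intensive_workqueue)
		return -ENOMEM;
	return 0;
}
```

Long-running iput work would then be queued here with queue_work() instead of on the regular nfsiod workqueue.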
-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx