On Thu, 07 Apr 2022, Dave Chinner wrote:
> On Wed, Apr 06, 2022 at 03:54:24PM -0400, J. Bruce Fields wrote:
> > In the last couple days I've started getting hangs on xfstests
> > generic/186 on upstream. I also notice the test completes after 10+
> > hours (usually it takes about 5 minutes). Sometimes this is accompanied
> > by "nfs: RPC call returned error 12" on the client.
>
> #define ENOMEM 12 /* Out of memory */
>
> So either the client or the server is running out of memory
> somewhere?

Probably the client. A number of recent changes add __GFP_NORETRY to
memory allocations from PF_WQ_WORKERs, because allocations that wait
indefinitely in those workers can result in deadlocks when swapping
over NFS.
This means that kmalloc requests that previously never failed (because
GFP_KERNEL never fails for kernel threads, I think) can now fail. This
has tickled one bug that I know of. There are likely to be more.

The RPC code should simply retry these allocations after a short delay.
HZ/4 is the number that is used in a couple of places. Possibly there
are more places that need to handle -ENOMEM with rpc_delay().

NeilBrown
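
To make the "retry after a short delay" idea concrete, here is a rough
userspace sketch. It is not the SUNRPC code: flaky_alloc(),
alloc_with_retry(), the delay callback, the max_tries bound and
RETRY_DELAY_MS (standing in for HZ/4) are all hypothetical names made up
for illustration. The callback plays the role that rpc_delay() would
play in the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for HZ/4: a quarter of a second, in ms. */
#define RETRY_DELAY_MS 250

/* Hypothetical allocator that can fail, loosely modelling a
 * kmalloc() call made with __GFP_NORETRY. */
struct flaky_allocator {
	int failures_left;	/* how many calls will still return NULL */
};

static void *flaky_alloc(struct flaky_allocator *a, size_t size)
{
	if (a->failures_left > 0) {
		a->failures_left--;
		return NULL;
	}
	return malloc(size);
}

/* Caller-supplied delay hook (assumption: stands in for rpc_delay()). */
typedef void (*delay_fn)(unsigned int ms);

/* Try the allocation up to max_tries times, calling delay() between
 * failed attempts.  Returns 0 with *out set, or -ENOMEM with *out NULL.
 * The bound on retries is an assumption of this sketch. */
static int alloc_with_retry(struct flaky_allocator *a, size_t size,
			    int max_tries, delay_fn delay, void **out)
{
	assert(max_tries > 0);

	for (int attempt = 1; attempt <= max_tries; attempt++) {
		*out = flaky_alloc(a, size);
		if (*out)
			return 0;
		if (attempt < max_tries)
			delay(RETRY_DELAY_MS);
	}
	return -ENOMEM;
}

/* Example delay hook that only counts how often it was invoked. */
static unsigned int delays_seen;

static void count_delay(unsigned int ms)
{
	(void)ms;
	delays_seen++;
}
```

With an allocator set up to fail twice, alloc_with_retry() succeeds on
the third attempt after two delays. If the failures outlast max_tries,
the caller sees -ENOMEM, which is roughly the situation the
"RPC call returned error 12" message reflects.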