On Mon, Oct 21, 2024 at 04:25:40PM +0800, Yang Erkun wrote: > In the normal case, when we excute `echo 0 > /proc/fs/nfsd/threads`, the > function `nfs4_state_destroy_net` in `nfs4_state_shutdown_net` will > release all resources related to the hashed `nfs4_client`. If the > `nfsd_client_shrinker` is running concurrently, the `expire_client` > function will first unhash this client and then destroy it. This can > lead to the following warning. Additionally, numerous use-after-free > errors may occur as well. > > nfsd_client_shrinker echo 0 > /proc/fs/nfsd/threads > > expire_client nfsd_shutdown_net > unhash_client ... > nfs4_state_shutdown_net > /* won't wait shrinker exit */ > /* cancel_work(&nn->nfsd_shrinker_work) > * nfsd_file for this /* won't destroy unhashed client1 */ > * client1 still alive nfs4_state_destroy_net > */ > > nfsd_file_cache_shutdown > /* trigger warning */ > kmem_cache_destroy(nfsd_file_slab) > kmem_cache_destroy(nfsd_file_mark_slab) > /* release nfsd_file and mark */ > __destroy_client > > ==================================================================== > BUG nfsd_file (Not tainted): Objects remaining in nfsd_file on > __kmem_cache_shutdown() > -------------------------------------------------------------------- > CPU: 4 UID: 0 PID: 764 Comm: sh Not tainted 6.12.0-rc3+ #1 > > dump_stack_lvl+0x53/0x70 > slab_err+0xb0/0xf0 > __kmem_cache_shutdown+0x15c/0x310 > kmem_cache_destroy+0x66/0x160 > nfsd_file_cache_shutdown+0xac/0x210 [nfsd] > nfsd_destroy_serv+0x251/0x2a0 [nfsd] > nfsd_svc+0x125/0x1e0 [nfsd] > write_threads+0x16a/0x2a0 [nfsd] > nfsctl_transaction_write+0x74/0xa0 [nfsd] > vfs_write+0x1a5/0x6d0 > ksys_write+0xc1/0x160 > do_syscall_64+0x5f/0x170 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > ==================================================================== > BUG nfsd_file_mark (Tainted: G B W ): Objects remaining > nfsd_file_mark on __kmem_cache_shutdown() > -------------------------------------------------------------------- > > dump_stack_lvl+0x53/0x70 > slab_err+0xb0/0xf0 > __kmem_cache_shutdown+0x15c/0x310 > kmem_cache_destroy+0x66/0x160 > nfsd_file_cache_shutdown+0xc8/0x210 [nfsd] > nfsd_destroy_serv+0x251/0x2a0 [nfsd] > nfsd_svc+0x125/0x1e0 [nfsd] > write_threads+0x16a/0x2a0 [nfsd] > nfsctl_transaction_write+0x74/0xa0 [nfsd] > vfs_write+0x1a5/0x6d0 > ksys_write+0xc1/0x160 > do_syscall_64+0x5f/0x170 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > To resolve this issue, cancel `nfsd_shrinker_work` using synchronous > mode in nfs4_state_shutdown_net. > > Fixes: 7746b32f467b ("NFSD: add shrinker to reap courtesy clients on low memory condition") I think like: Fixes: 7c24fa225081 ("NFSD: replace delayed_work with work_struct for nfsd_client_shrinker") a little better. I'm very curious how you stumbled across this one! > Signed-off-by: Yang Erkun <yangerkun@xxxxxxxxxxxxxxx> > --- > fs/nfsd/nfs4state.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c > index 56b261608af4..158d67186777 100644 > --- a/fs/nfsd/nfs4state.c > +++ b/fs/nfsd/nfs4state.c > @@ -8684,7 +8684,7 @@ nfs4_state_shutdown_net(struct net *net) > struct nfsd_net *nn = net_generic(net, nfsd_net_id); > > shrinker_free(nn->nfsd_client_shrinker); > - cancel_work(&nn->nfsd_shrinker_work); > + cancel_work_sync(&nn->nfsd_shrinker_work); > cancel_delayed_work_sync(&nn->laundromat_work); > locks_end_grace(&nn->nfsd4_manager); > > -- > 2.39.2 > -- Chuck Lever