On Wed, 2021-11-24 at 17:06 -0500, bfields@xxxxxxxxxxxx wrote: > On Wed, Nov 24, 2021 at 05:14:53PM +0000, Trond Myklebust wrote: > > It is a little nasty that we hide the list_del() calls in several > > levels of function call, so they probably do deserve a comment. > > > > That said, if, as in the case here, the delegation was unhashed, we > > still end up not calling list_del_init() in > > unhash_delegation_locked(), > > and since the list_add() is not conditional on it being successful, > > the > > global list is again corrupted. > > > > Yes, it is an unlikely race, but it is possible despite your > > change. > > Thanks, good point. > > Probably not something anyone's actually hitting, but another sign > this > logic need rethinking. > I think it should be sufficient to let the laundromat skip that entry and leave it on the list if the unhash_delegation_locked() fails, since your fix should then be able to pick the delegation up and destroy it safely. We can keep the code in __destroy_client() and nfs4_state_shutdown_net() unchanged, since those are presumably not affected by this race. -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@xxxxxxxxxxxxxxx