On Fri, Nov 01, 2024 at 08:28:42AM -0400, Jeff Layton wrote: > On Thu, 2024-10-31 at 09:40 -0400, cel@xxxxxxxxxx wrote: > > From: Chuck Lever <chuck.lever@xxxxxxxxxx> > > > > nfsd4_shutdown_copy() is just this: > > > > while ((copy = nfsd4_get_copy(clp)) != NULL) > > nfsd4_stop_copy(copy); > > > > nfsd4_get_copy() bumps @copy's reference count, preventing > > nfsd4_stop_copy() from releasing @copy. > > > > A while loop like this usually works by removing the first element > > of the list, but neither nfsd4_get_copy() nor nfsd4_stop_copy() > > alters the async_copies list. > > > > Best I can tell, then, is that nfsd4_shutdown_copy() continues to > > loop until other threads manage to remove all the items from this > > list. The spinning loop blocks shutdown until these items are gone. > > > > Possibly the reason we haven't seen this issue in the field is > > because client_has_state() prevents __destroy_client() from calling > > nfsd4_shutdown_copy() if there are any items on this list. In a > > subsequent patch I plan to remove that restriction. > > > > Fixes: e0639dc5805a ("NFSD introduce async copy feature") > > Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx> > > --- > > fs/nfsd/nfs4proc.c | 3 +++ > > 1 file changed, 3 insertions(+) > > > > diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c > > index 9f6617fa5412..8229bbfdd735 100644 > > --- a/fs/nfsd/nfs4proc.c > > +++ b/fs/nfsd/nfs4proc.c > > @@ -1302,6 +1302,9 @@ static struct nfsd4_copy *nfsd4_get_copy(struct nfs4_client *clp) > > copy = list_first_entry(&clp->async_copies, struct nfsd4_copy, > > copies); > > refcount_inc(©->refcount); > > + copy->cp_clp = NULL; > > + if (!list_empty(©->copies)) > > + list_del_init(©->copies); > > } > > spin_unlock(&clp->async_lock); > > return copy; > > My initial thought was: > > The problem sounds real, but is nfsd4_get_copy() the place for the > above logic? > > But then I noticed that nfsd4_get_copy() is only called from > nfsd4_shutdown_copy(). Maybe we should be rename nfsd4_get_copy() to > nfsd4_find_and_unhash_copy() ? nfsd4_get_copy() book-ends nfsd4_put_copy(). But nfsd4_unhash_copy() would work too, I think. -- Chuck Lever