On Fri, 2020-01-03 at 11:47 -0500, Bruce Fields wrote:
> On Wed, Dec 18, 2019 at 06:20:56PM -0500, Chuck Lever wrote:
> > > On Dec 13, 2019, at 3:12 PM, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
> > > Does something like the following help?
> > >
> > > 8<---------------------------------------------------
> > > From caf515c82ed572e4f92ac8293e5da4818da0c6ce Mon Sep 17 00:00:00 2001
> > > From: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
> > > Date: Fri, 13 Dec 2019 15:07:33 -0500
> > > Subject: [PATCH] nfsd: Fix a soft lockup race in nfsd_file_mark_find_or_create()
> > >
> > > If nfsd_file_mark_find_or_create() keeps winning the race for the
> > > nfsd_file_fsnotify_group->mark_mutex against nfsd_file_mark_put()
> > > then it can soft lock up, since fsnotify_add_inode_mark() ends
> > > up always finding an existing entry.
> > >
> > > Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
> > > ---
> > >  fs/nfsd/filecache.c | 8 ++++++--
> > >  1 file changed, 6 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> > > index 9c2b29e07975..f275c11c4e28 100644
> > > --- a/fs/nfsd/filecache.c
> > > +++ b/fs/nfsd/filecache.c
> > > @@ -132,9 +132,13 @@ nfsd_file_mark_find_or_create(struct nfsd_file *nf)
> > >  						struct nfsd_file_mark,
> > >  						nfm_mark));
> > >  			mutex_unlock(&nfsd_file_fsnotify_group->mark_mutex);
> > > -			fsnotify_put_mark(mark);
> > > -			if (likely(nfm))
> > > +			if (nfm) {
> > > +				fsnotify_put_mark(mark);
> > >  				break;
> > > +			}
> > > +			/* Avoid soft lockup race with nfsd_file_mark_put() */
> > > +			fsnotify_destroy_mark(mark, nfsd_file_fsnotify_group);
> > > +			fsnotify_put_mark(mark);
> > >  		} else
> > >  			mutex_unlock(&nfsd_file_fsnotify_group->mark_mutex);
> > >
> >
> > I've tried to reproduce the lockup for three days with this patch
> > applied to my server. No lockup.
> >
> > Tested-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
>
> I'm applying this for 5.5 with Chuck's tested-by and:
>
> Fixes: 65294c1f2c5e "nfsd: add a new struct file caching facility to nfsd"
>

I've got more coming. We've been doing data integrity tests and have
hit some issues around error reporting that need to be fixed in knfsd
in order to avoid silent corruption of data. I'll be sending a bulk
patchset for 5.5 soon.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx