Hi,

On Tue, Aug 15, 2023 at 2:21 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
>
> On Tue, 2023-08-15 at 13:49 -0400, Jeff Layton wrote:
> > On Mon, 2023-08-14 at 17:11 -0400, Alexander Aring wrote:
> > > This patch fixes a race in async lock request handling between
> > > adding the relevant struct nlm_block to the nlm_blocked list after
> > > the request was sent by vfs_lock_file() and the lookup of that
> > > nlm_block in the nlm_blocked list by nlmsvc_grant_deferred(). The
> > > async request can complete before the nlm_block has been added to
> > > the list, which ends in -ENOENT and a kernel log message of
> > > "lockd: grant for unknown block".
> > >
> > > To solve this issue, add the nlm_block before the vfs_lock_file()
> > > call, so it is guaranteed to be on the list when a possible
> > > nlmsvc_grant_deferred() is called. If vfs_lock_file() returns a
> > > result for which the nlm_block should not stay on the nlm_blocked
> > > list, the nlm_block is removed from the list again.
> > >
> > > The new B_PENDING_CALLBACK nlm_block flag makes an async lock
> > > request on a pending lock request hit the -EAGAIN case, so the
> > > caller retries.
> > >
> > > Signed-off-by: Alexander Aring <aahringo@xxxxxxxxxx>
> > > ---
> > >  fs/lockd/svclock.c          | 100 ++++++++++++++++++++++++++----------
> > >  include/linux/lockd/lockd.h |   2 +
> > >  2 files changed, 74 insertions(+), 28 deletions(-)
> > >
> >
> > [...]
> > > diff --git a/include/linux/lockd/lockd.h b/include/linux/lockd/lockd.h
> > > index f42594a9efe0..91f55458f5fc 100644
> > > --- a/include/linux/lockd/lockd.h
> > > +++ b/include/linux/lockd/lockd.h
> > > @@ -185,10 +185,12 @@ struct nlm_block {
> > >  	struct nlm_file *	b_file;		/* file in question */
> > >  	struct cache_req *	b_cache_req;	/* deferred request handling */
> > >  	struct cache_deferred_req * b_deferred_req;
> > > +	struct mutex		b_cb_mutex;	/* callback mutex */
> >
> > There is no mention at all of this new mutex in the changelog or
> > comments. It's not at all clear to me what this is intended to protect.
> > In general, with lockd being a single-threaded service, we want to avoid
> > sleeping locks. This will need some clear justification.
> >
> > At a glance, it looks like you're trying to use this to hold
> > B_PENDING_CALLBACK steady while a lock request is being handled. That
> > suggests that you're using this mutex to serialize access to a section
> > of code and not one or more specific data structures. We usually like to
> > avoid that sort of thing, since locks that protect arbitrary sections of
> > code become difficult to work with over time.
> >
> > I'm going to go out on a limb here though and suggest that there is
> > probably a way to solve this problem that doesn't involve adding new
> > locks.
> >
> > >  	unsigned int		b_flags;	/* block flags */
> > >  #define B_QUEUED		1	/* lock queued */
> > >  #define B_GOT_CALLBACK	2	/* got lock or conflicting lock */
> > >  #define B_TIMED_OUT		4	/* filesystem too slow to respond */
> > > +#define B_PENDING_CALLBACK	8	/* pending callback for lock request */
> > >  };
> > >
> > >  /*
> >
> > Do we need this new flag at all? It seems redundant. If we have a block
> > on the list, then it is sort of by definition "pending callback". If
> > it's not on the list anymore, then it's not. No?
> >
> Do we need anything more than a patch along these lines?
> Note that this is untested, so RFC:
>
> ---------------------8<-----------------------
>
> [RFC PATCH] lockd: alternate fix for race between deferred lock and grant
>
> Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
> ---
>  fs/lockd/svclock.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
> index c43ccdf28ed9..e9a84363c26e 100644
> --- a/fs/lockd/svclock.c
> +++ b/fs/lockd/svclock.c
> @@ -446,6 +446,8 @@ nlmsvc_defer_lock_rqst(struct svc_rqst *rqstp, struct nlm_block *block)
>
>  	block->b_flags |= B_QUEUED;
>
> +	/* FIXME: remove and reinsert w/o dropping spinlock */
> +	nlmsvc_remove_block(block);
>  	nlmsvc_insert_block(block, NLM_TIMEOUT);
>

A plain insert should be fine here, without the remove: nlmsvc_insert_block()
already handles a block that is on nlm_blocked by unlinking and re-linking
it, which updates the block's timeout and its position in the list
(nlm_blocked is kept sorted by timeout).

>  	block->b_cache_req = &rqstp->rq_chandle;
> @@ -535,6 +537,9 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
>  	if (!wait)
>  		lock->fl.fl_flags &= ~FL_SLEEP;
>  	mode = lock_to_openmode(&lock->fl);
> +
> +	/* Append to list of blocked */
> +	nlmsvc_insert_block(block, NLM_NEVER);
>  	error = vfs_lock_file(file->f_file[mode], F_SETLK, &lock->fl, NULL);
>  	lock->fl.fl_flags &= ~FL_SLEEP;
>
> @@ -542,6 +547,7 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
>  	switch (error) {
>  	case 0:
>  		ret = nlm_granted;
> +		nlmsvc_remove_block(block);
>  		goto out;
>  	case -EAGAIN:
>  		/*
> @@ -552,6 +558,7 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
>  		if (wait)
>  			break;
>  		ret = async_block ?
>  			nlm_lck_blocked : nlm_lck_denied;
> +		nlmsvc_remove_block(block);
>  		goto out;
>  	case FILE_LOCK_DEFERRED:
>  		if (wait)
> @@ -570,8 +577,6 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
>
>  	ret = nlm_lck_blocked;
>
> -	/* Append to list of blocked */
> -	nlmsvc_insert_block(block, NLM_NEVER);

OK, I will start with that.

- Alex