On Mon, 2021-12-06 at 12:36 -0800, dai.ngo@xxxxxxxxxx wrote: > > On 12/6/21 12:05 PM, bfields@xxxxxxxxxxxx wrote: > > On Mon, Dec 06, 2021 at 07:52:29PM +0000, Trond Myklebust wrote: > > > On Mon, 2021-12-06 at 18:39 +0000, Chuck Lever III wrote: > > > > > > > > > On Dec 6, 2021, at 12:59 PM, Dai Ngo <dai.ngo@xxxxxxxxxx> > > > > > wrote: > > > > > > > > > > Add new callback, lm_expire_lock, to lock_manager_operations > > > > > to > > > > > allow > > > > > the lock manager to take appropriate action to resolve the > > > > > lock > > > > > conflict > > > > > if possible. The callback takes 2 arguments, file_lock of the > > > > > blocker > > > > > and a testonly flag: > > > > > > > > > > testonly = 1 check and return true if lock conflict can be > > > > > resolved > > > > > else return false. > > > > > testonly = 0 resolve the conflict if possible, return true > > > > > if > > > > > conflict > > > > > was resolved esle return false. > > > > > > > > > > Lock manager, such as NFSv4 courteous server, uses this > > > > > callback to > > > > > resolve conflict by destroying lock owner, or the NFSv4 > > > > > courtesy > > > > > client > > > > > (client that has expired but allowed to maintains its states) > > > > > that > > > > > owns > > > > > the lock. > > > > > > > > > > Signed-off-by: Dai Ngo <dai.ngo@xxxxxxxxxx> > > > > Al, Jeff, as co-maintainers of record for fs/locks.c, can you > > > > give > > > > an Ack or Reviewed-by? I'd like to take this patch through the > > > > nfsd > > > > tree for v5.17. Thanks for your time! > > > > > > > > > > > > > --- > > > > > fs/locks.c | 28 +++++++++++++++++++++++++--- > > > > > include/linux/fs.h | 1 + > > > > > 2 files changed, 26 insertions(+), 3 deletions(-) > > > > > > > > > > diff --git a/fs/locks.c b/fs/locks.c > > > > > index 3d6fb4ae847b..0fef0a6322c7 100644 > > > > > --- a/fs/locks.c > > > > > +++ b/fs/locks.c > > > > > @@ -954,6 +954,7 @@ posix_test_lock(struct file *filp, struct > > > > > file_lock *fl) > > > > > struct file_lock *cfl; > > > > > struct file_lock_context *ctx; > > > > > struct inode *inode = locks_inode(filp); > > > > > + bool ret; > > > > > > > > > > ctx = smp_load_acquire(&inode->i_flctx); > > > > > if (!ctx || list_empty_careful(&ctx->flc_posix)) { > > > > > @@ -962,11 +963,20 @@ posix_test_lock(struct file *filp, > > > > > struct > > > > > file_lock *fl) > > > > > } > > > > > > > > > > spin_lock(&ctx->flc_lock); > > > > > +retry: > > > > > list_for_each_entry(cfl, &ctx->flc_posix, fl_list) { > > > > > - if (posix_locks_conflict(fl, cfl)) { > > > > > - locks_copy_conflock(fl, cfl); > > > > > - goto out; > > > > > + if (!posix_locks_conflict(fl, cfl)) > > > > > + continue; > > > > > + if (cfl->fl_lmops && cfl->fl_lmops- > > > > > >lm_expire_lock > > > > > && > > > > > + cfl->fl_lmops- > > > > > >lm_expire_lock(cfl, > > > > > 1)) { > > > > > + spin_unlock(&ctx->flc_lock); > > > > > + ret = cfl->fl_lmops- > > > > > >lm_expire_lock(cfl, > > > > > 0); > > > > > + spin_lock(&ctx->flc_lock); > > > > > + if (ret) > > > > > + goto retry; > > > > > } > > > > > + locks_copy_conflock(fl, cfl); > > > How do you know 'cfl' still points to a valid object after you've > > > dropped the spin lock that was protecting the list? > > Ugh, good point, I should have noticed that when I suggested this > > approach.... > > > > Maybe the first call could instead return return some reference- > > counted > > object that a second call could wait on. > > > > Better, maybe it could add itself to a list of such things and then > > we > > could do this in one pass. > > I think we adjust this logic a little bit to cover race condition: > > The 1st call to lm_expire_lock returns the client needs to be > expired. > > Before we make the 2nd call, we save the 'lm_expire_lock' into a > local > variable then drop the spinlock, and use the local variable to make > the > 2nd call so that we do not reference 'cfl'. The argument of the > second > is the opaque return value from the 1st call. > > nfsd4_fl_expire_lock also needs some adjustment to support the above. > It's not just the fact that you're using 'cfl' in the actual call to lm_expire_lock(), but you're also using it after retaking the spinlock. -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@xxxxxxxxxxxxxxx