Re: [PATCH] lockd: set other missing fields when unlocking files

Jeff Layton <jlayton@xxxxxxxxxx> · Tue, 08 Nov 2022 13:51:00 -0500

On Tue, 2022-11-08 at 16:52 +0000, Chuck Lever III wrote:
> 
> > On Nov 8, 2022, at 11:41 AM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > 
> > On Tue, 2022-11-08 at 14:57 +0000, Chuck Lever III wrote:
> > > 
> > > > On Nov 7, 2022, at 4:55 PM, Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote:
> > > > 
> > > > > On Nov 7, 2022, at 5:48 AM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > > > > 
> > > > > On Sun, 2022-11-06 at 14:02 -0500, trondmy@xxxxxxxxxx wrote:
> > > > > > From: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
> > > > > > 
> > > > > > vfs_lock_file() expects the struct file_lock to be fully initialised by
> > > > > > the caller.
> > > > 
> > > > As a reviewer, I don't see anything in the vfs_lock_file() kdoc
> > > > comment that suggests this, and vfs_lock_file() itself is just
> > > > a wrapper around each filesystem's f_ops->lock method. That
> > > > expectation is a bit deeper into NFS-specific code. A few more
> > > > observations below.
> > > > 
> > > > 
> > > > > > Re-exported NFSv3 has been seen to Oops if the fl_file field
> > > > > > is NULL.
> > > > 
> > > > Needs a Link: to the bug report. Which I can add.
> > > > 
> > > > This will also give us a call trace we can reference, so I won't
> > > > add that here.
> > > > 
> > > > 
> > > > > > Fixes: aec158242b87 ("lockd: set fl_owner when unlocking files")
> > > > > > Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
> > > > > > ---
> > > > > > fs/lockd/svcsubs.c | 17 ++++++++++-------
> > > > > > 1 file changed, 10 insertions(+), 7 deletions(-)
> > > > > > 
> > > > > > diff --git a/fs/lockd/svcsubs.c b/fs/lockd/svcsubs.c
> > > > > > index e1c4617de771..3515f17eaf3f 100644
> > > > > > --- a/fs/lockd/svcsubs.c
> > > > > > +++ b/fs/lockd/svcsubs.c
> > > > > > @@ -176,7 +176,7 @@ nlm_delete_file(struct nlm_file *file)
> > > > > > 	}
> > > > > > }
> > > > > > 
> > > > > > -static int nlm_unlock_files(struct nlm_file *file, fl_owner_t owner)
> > > > > > +static int nlm_unlock_files(struct nlm_file *file, const struct file_lock *fl)
> > > > > > {
> > > > > > 	struct file_lock lock;
> > > > > > 
> > > > > > @@ -184,12 +184,15 @@ static int nlm_unlock_files(struct nlm_file *file, fl_owner_t owner)
> > > > > > 	lock.fl_type  = F_UNLCK;
> > > > > > 	lock.fl_start = 0;
> > > > > > 	lock.fl_end   = OFFSET_MAX;
> > > > > > -	lock.fl_owner = owner;
> > > > > > -	if (file->f_file[O_RDONLY] &&
> > > > > > -	    vfs_lock_file(file->f_file[O_RDONLY], F_SETLK, &lock, NULL))
> > > > > > +	lock.fl_owner = fl->fl_owner;
> > > > > > +	lock.fl_pid   = fl->fl_pid;
> > > > > > +	lock.fl_flags = FL_POSIX;
> > > > > > +
> > > > > > +	lock.fl_file = file->f_file[O_RDONLY];
> > > > > > +	if (lock.fl_file && vfs_lock_file(lock.fl_file, F_SETLK, &lock, NULL))
> > > > > > 		goto out_err;
> > > > > > -	if (file->f_file[O_WRONLY] &&
> > > > > > -	    vfs_lock_file(file->f_file[O_WRONLY], F_SETLK, &lock, NULL))
> > > > > > +	lock.fl_file = file->f_file[O_WRONLY];
> > > > > > +	if (lock.fl_file && vfs_lock_file(lock.fl_file, F_SETLK, &lock, NULL))
> > > > > > 		goto out_err;
> > > > > > 	return 0;
> > > > > > out_err:
> > > > > > @@ -226,7 +229,7 @@ nlm_traverse_locks(struct nlm_host *host, struct nlm_file *file,
> > > > > > 		if (match(lockhost, host)) {
> > > > > > 
> > > > > > 			spin_unlock(&flctx->flc_lock);
> > > > > > -			if (nlm_unlock_files(file, fl->fl_owner))
> > > > > > +			if (nlm_unlock_files(file, fl))
> > > > > > 				return 1;
> > > > > > 			goto again;
> > > > > > 		}
> > > > > 
> > > > > Good catch.
> > > > > 
> > > > > I wonder if we ought to roll an initializer function for file_locks to
> > > > > make it harder for callers to miss setting some fields like this? One
> > > > > idea: we could change vfs_lock_file to *not* take a file argument, and
> > > > > insist that the caller fill out fl_file when calling it? That would make
> > > > > it harder to screw this up.
> > > > 
> > > > Commit history shows that, at least as far back as the beginning of
> > > > the git era, the vfs_lock_file() call site here did not initialize
> > > > the fl_file field. So, this code has been working without fully
> > > > initializing @fl for, like, forever.
> > > > 
> > > > 
> > > > Trond later says:
> > > > > The regression occurs in 5.16, because that was when Bruce merged his
> > > > > patches to enable locking when doing NFS re-exporting.
> > > > 
> > > > That means the Fixes: tag above is misleading. The proposed patch
> > > > doesn't actually fix that commit (which went into v5.19), it simply
> > > > applies on that commit.
> > > > 
> > > > I haven't been able to find the locking patches mentioned here. I think
> > > > those bear mentioning (by commit ID) in the patch description, at least.
> > > > If you know the commit ID, Trond, can you pass it along?
> > > > 
> > > > Though I would say that, in agreement with Jeff, the true cause of this
> > > > issue is the awkward synopsis for vfs_lock_file().
> > > 
> > > Since Trond has re-assigned the kernel.org bug to me... I'll blather on
> > > a bit more. (Yesterday's patch is still queued up, I can replace it or
> > > move it depending on the outcome of this discussion).
> > > 
> > > -> The vfs_{test,lock,cancel}_file APIs all take a file argument. Maybe
> > > we shouldn't remove the @filp argument from vfs_lock_file().
> > > 
> > 
> > They all take a file_lock argument as well. @filp is redundant in all of
> > them. Keeping both just increases the ambiguity. I move that we drop the
> > explicit argument since we need to set it in the struct anyway.
> 
> Sounds good to me.
> 
> 
> > We could also consider adding a @filp arguments to locks_alloc_lock and
> > locks_init_lock, to make it a bit more evident that it needs to be set.
> > 
> > > -> The struct file_lock * argument of vfs_lock_file() is not a const.
> > > 
> > 
> > That might be tough. Even for "request" fl's we modify some fields in
> > them (for example, fl_wait and fl_blocked_member). fl_file should never
> > change though, once it has been assigned. We could potentially make that
> > const.
> > 
> > > After auditing the call sites, I think it would be safe for vfs_lock_file()
> > > to explicitly overwrite the fl->fl_file field with the value of the @filp
> > > argument before calling f_ops->lock. At the very least, it should sanity-
> > > check that the two pointer values are the same, and document that as an
> > > API requirement.
> > > 
> > > Alternatively we could cook up an NFS-specific fix... but the vfs_lock_file
> > > API would still look dodgy.
> > > 
> > 
> > I see no reason to do anything NFS-specific here. I'd be fine with
> > WARN_ONs in locks.c for now, until we decide what to do longer term.
> > It's possible we have some other call chains that are not setting that
> > field correctly.
> 
> Agreed, a WARN_ON would be a good first step.
> 
> 

Yep. There aren't that many callers either, so I'll plan to do a sweep
and look for any that aren't setting it now. Once I do that, I'll add
the WARN_ON to catch any I missed.

> > If we can audit all of the call sites and ensure that they are properly
> > setting fl_file in the struct, we should be able to painlessly drop the
> > separate @filp argument from all of those functions.
> 
> The only one I found that doesn't set fl_file close to the vfs_lock_file
> call site is do_lock_file_wait().
> 

I'll look at that. At a glance though, it looks like its callers all set
fl_file (mostly via the struct flock conversion routines). We can
probably eliminate the filp argument on that too.

> 
> > I'll toss it onto my to-do pile.
> 
> I'm assuming you mean you'll do the API clean-up, and that I should
> keep Trond's fix in the nfsd queue.
> 

Yes. I'll plan to go through it here in the near future. In the
meantime, we'll still need Trond's fix for lockd.
-- 
Jeff Layton <jlayton@xxxxxxxxxx>