Re: [PATCH] lockd: set other missing fields when unlocking files

Chuck Lever III <chuck.lever@xxxxxxxxxx> · Mon, 7 Nov 2022 21:55:37 +0000

> On Nov 7, 2022, at 5:48 AM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> 
> On Sun, 2022-11-06 at 14:02 -0500, trondmy@xxxxxxxxxx wrote:
>> From: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
>> 
>> vfs_lock_file() expects the struct file_lock to be fully initialised by
>> the caller.

As a reviewer, I don't see anything in the vfs_lock_file() kdoc
comment that suggests this, and vfs_lock_file() itself is just
a wrapper around each filesystem's f_op->lock method. That
expectation lives a bit deeper, in NFS-specific code. A few more
observations below.

>> Re-exported NFSv3 has been seen to Oops if the fl_file field
>> is NULL.

This needs a Link: to the bug report, which I can add.

This will also give us a call trace we can reference, so I won't
add that here.

>> Fixes: aec158242b87 ("lockd: set fl_owner when unlocking files")
>> Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
>> ---
>> fs/lockd/svcsubs.c | 17 ++++++++++-------
>> 1 file changed, 10 insertions(+), 7 deletions(-)
>> 
>> diff --git a/fs/lockd/svcsubs.c b/fs/lockd/svcsubs.c
>> index e1c4617de771..3515f17eaf3f 100644
>> --- a/fs/lockd/svcsubs.c
>> +++ b/fs/lockd/svcsubs.c
>> @@ -176,7 +176,7 @@ nlm_delete_file(struct nlm_file *file)
>> 	}
>> }
>> 
>> -static int nlm_unlock_files(struct nlm_file *file, fl_owner_t owner)
>> +static int nlm_unlock_files(struct nlm_file *file, const struct file_lock *fl)
>> {
>> 	struct file_lock lock;
>> 
>> @@ -184,12 +184,15 @@ static int nlm_unlock_files(struct nlm_file *file, fl_owner_t owner)
>> 	lock.fl_type  = F_UNLCK;
>> 	lock.fl_start = 0;
>> 	lock.fl_end   = OFFSET_MAX;
>> -	lock.fl_owner = owner;
>> -	if (file->f_file[O_RDONLY] &&
>> -	    vfs_lock_file(file->f_file[O_RDONLY], F_SETLK, &lock, NULL))
>> +	lock.fl_owner = fl->fl_owner;
>> +	lock.fl_pid   = fl->fl_pid;
>> +	lock.fl_flags = FL_POSIX;
>> +
>> +	lock.fl_file = file->f_file[O_RDONLY];
>> +	if (lock.fl_file && vfs_lock_file(lock.fl_file, F_SETLK, &lock, NULL))
>> 		goto out_err;
>> -	if (file->f_file[O_WRONLY] &&
>> -	    vfs_lock_file(file->f_file[O_WRONLY], F_SETLK, &lock, NULL))
>> +	lock.fl_file = file->f_file[O_WRONLY];
>> +	if (lock.fl_file && vfs_lock_file(lock.fl_file, F_SETLK, &lock, NULL))
>> 		goto out_err;
>> 	return 0;
>> out_err:
>> @@ -226,7 +229,7 @@ nlm_traverse_locks(struct nlm_host *host, struct nlm_file *file,
>> 		if (match(lockhost, host)) {
>> 
>> 			spin_unlock(&flctx->flc_lock);
>> -			if (nlm_unlock_files(file, fl->fl_owner))
>> +			if (nlm_unlock_files(file, fl))
>> 				return 1;
>> 			goto again;
>> 		}
> 
> Good catch.
> 
> I wonder if we ought to roll an initializer function for file_locks to
> make it harder for callers to miss setting some fields like this? One
> idea: we could change vfs_lock_file to *not* take a file argument, and
> insist that the caller fill out fl_file when calling it? That would make
> it harder to screw this up.

Commit history shows that, at least as far back as the beginning of
the git era, the vfs_lock_file() call site here did not initialize
the fl_file field. So this code has been working without fully
initializing @fl for as long as we have history.

Trond later says:
> The regression occurs in 5.16, because that was when Bruce merged his
> patches to enable locking when doing NFS re-exporting.

That means the Fixes: tag above is misleading. The proposed patch
doesn't actually fix that commit (which went into v5.19); it simply
applies on top of it.

I haven't been able to find the locking patches mentioned here. I think
those bear mentioning (by commit ID) in the patch description, at least.
If you know the commit ID, Trond, can you pass it along?

Though I would say that, in agreement with Jeff, the true cause of this
issue is the awkward synopsis for vfs_lock_file().
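
To make that concrete, here is a rough userspace sketch of the two
shapes. It is not kernel code: the structs are cut-down stand-ins, and
every mock_* name and MOCK_* constant is made up for illustration. The
first wrapper takes the file separately, as vfs_lock_file() does today,
so nothing notices a NULL fl_file until the filesystem's lock method
trips over it. The second follows Jeff's idea of dropping the file
argument, paired with a hypothetical initializer that fills every field
the patch above sets by hand.

```c
#include <stddef.h>
#include <string.h>

/* Userspace stand-ins only; these are NOT the kernel definitions. */
struct file_lock;

struct file {
	/* Models the filesystem's f_op->lock method. */
	int (*lock)(struct file *filp, int cmd, struct file_lock *fl);
};

struct file_lock {
	struct file *fl_file;
	void *fl_owner;
	int fl_pid;
	unsigned int fl_flags;
	int fl_type;
	long long fl_start;
	long long fl_end;
};

/* Made-up values, chosen only so the sketch compiles on its own. */
#define MOCK_FL_POSIX   1u
#define MOCK_F_UNLCK    2
#define MOCK_OFFSET_MAX 0x7fffffffffffffffLL

/* A lock method that relies on fl->fl_file having been set by the caller. */
int mock_fs_lock(struct file *filp, int cmd, struct file_lock *fl)
{
	(void)filp;
	(void)cmd;
	if (fl->fl_file == NULL)
		return -1;	/* stands in for the NULL dereference */
	return 0;
}

/* Current shape: the file is passed separately from the lock. */
int mock_vfs_lock_file(struct file *filp, int cmd, struct file_lock *fl)
{
	return filp->lock(filp, cmd, fl);
}

/* Hypothetical shape: no file argument, so fl->fl_file must be valid. */
int mock_vfs_lock_file_fl(int cmd, struct file_lock *fl)
{
	if (fl->fl_file == NULL)
		return -2;	/* rejected before reaching the filesystem */
	return fl->fl_file->lock(fl->fl_file, cmd, fl);
}

/* Hypothetical initializer for a whole-file POSIX unlock request. */
void mock_init_unlock(struct file_lock *fl, struct file *filp,
		      void *owner, int pid)
{
	memset(fl, 0, sizeof(*fl));
	fl->fl_file  = filp;
	fl->fl_owner = owner;
	fl->fl_pid   = pid;
	fl->fl_flags = MOCK_FL_POSIX;
	fl->fl_type  = MOCK_F_UNLCK;
	fl->fl_start = 0;
	fl->fl_end   = MOCK_OFFSET_MAX;
}
```

With the second shape, a caller that forgets fl_file gets an error from
the wrapper itself rather than from whichever filesystem happens to look
at the field.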

--
Chuck Lever