Re: [PATCH] lockd: set other missing fields when unlocking files

Chuck Lever III <chuck.lever@xxxxxxxxxx> · Tue, 8 Nov 2022 16:52:57 +0000

> On Nov 8, 2022, at 11:41 AM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> 
> On Tue, 2022-11-08 at 14:57 +0000, Chuck Lever III wrote:
>> 
>>> On Nov 7, 2022, at 4:55 PM, Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote:
>>> 
>>>> On Nov 7, 2022, at 5:48 AM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
>>>> 
>>>> On Sun, 2022-11-06 at 14:02 -0500, trondmy@xxxxxxxxxx wrote:
>>>>> From: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
>>>>> 
>>>>> vfs_lock_file() expects the struct file_lock to be fully initialised by
>>>>> the caller.
>>> 
>>> As a reviewer, I don't see anything in the vfs_lock_file() kdoc
>>> comment that suggests this, and vfs_lock_file() itself is just
>>> a wrapper around each filesystem's f_ops->lock method. That
>>> expectation is a bit deeper into NFS-specific code. A few more
>>> observations below.
>>> 
>>> 
>>>>> Re-exported NFSv3 has been seen to Oops if the fl_file field
>>>>> is NULL.
>>> 
>>> Needs a Link: to the bug report. Which I can add.
>>> 
>>> This will also give us a call trace we can reference, so I won't
>>> add that here.
>>> 
>>> 
>>>>> Fixes: aec158242b87 ("lockd: set fl_owner when unlocking files")
>>>>> Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
>>>>> ---
>>>>> fs/lockd/svcsubs.c | 17 ++++++++++-------
>>>>> 1 file changed, 10 insertions(+), 7 deletions(-)
>>>>> 
>>>>> diff --git a/fs/lockd/svcsubs.c b/fs/lockd/svcsubs.c
>>>>> index e1c4617de771..3515f17eaf3f 100644
>>>>> --- a/fs/lockd/svcsubs.c
>>>>> +++ b/fs/lockd/svcsubs.c
>>>>> @@ -176,7 +176,7 @@ nlm_delete_file(struct nlm_file *file)
>>>>> 	}
>>>>> }
>>>>> 
>>>>> -static int nlm_unlock_files(struct nlm_file *file, fl_owner_t owner)
>>>>> +static int nlm_unlock_files(struct nlm_file *file, const struct file_lock *fl)
>>>>> {
>>>>> 	struct file_lock lock;
>>>>> 
>>>>> @@ -184,12 +184,15 @@ static int nlm_unlock_files(struct nlm_file *file, fl_owner_t owner)
>>>>> 	lock.fl_type  = F_UNLCK;
>>>>> 	lock.fl_start = 0;
>>>>> 	lock.fl_end   = OFFSET_MAX;
>>>>> -	lock.fl_owner = owner;
>>>>> -	if (file->f_file[O_RDONLY] &&
>>>>> -	    vfs_lock_file(file->f_file[O_RDONLY], F_SETLK, &lock, NULL))
>>>>> +	lock.fl_owner = fl->fl_owner;
>>>>> +	lock.fl_pid   = fl->fl_pid;
>>>>> +	lock.fl_flags = FL_POSIX;
>>>>> +
>>>>> +	lock.fl_file = file->f_file[O_RDONLY];
>>>>> +	if (lock.fl_file && vfs_lock_file(lock.fl_file, F_SETLK, &lock, NULL))
>>>>> 		goto out_err;
>>>>> -	if (file->f_file[O_WRONLY] &&
>>>>> -	    vfs_lock_file(file->f_file[O_WRONLY], F_SETLK, &lock, NULL))
>>>>> +	lock.fl_file = file->f_file[O_WRONLY];
>>>>> +	if (lock.fl_file && vfs_lock_file(lock.fl_file, F_SETLK, &lock, NULL))
>>>>> 		goto out_err;
>>>>> 	return 0;
>>>>> out_err:
>>>>> @@ -226,7 +229,7 @@ nlm_traverse_locks(struct nlm_host *host, struct nlm_file *file,
>>>>> 		if (match(lockhost, host)) {
>>>>> 
>>>>> 			spin_unlock(&flctx->flc_lock);
>>>>> -			if (nlm_unlock_files(file, fl->fl_owner))
>>>>> +			if (nlm_unlock_files(file, fl))
>>>>> 				return 1;
>>>>> 			goto again;
>>>>> 		}
>>>> 
>>>> Good catch.
>>>> 
>>>> I wonder if we ought to roll an initializer function for file_locks to
>>>> make it harder for callers to miss setting some fields like this? One
>>>> idea: we could change vfs_lock_file to *not* take a file argument, and
>>>> insist that the caller fill out fl_file when calling it? That would make
>>>> it harder to screw this up.
>>> 
>>> Commit history shows that, at least as far back as the beginning of
>>> the git era, the vfs_lock_file() call site here did not initialize
>>> the fl_file field. So, this code has been working without fully
>>> initializing @fl for, like, forever.
>>> 
>>> 
>>> Trond later says:
>>>> The regression occurs in 5.16, because that was when Bruce merged his
>>>> patches to enable locking when doing NFS re-exporting.
>>> 
>>> That means the Fixes: tag above is misleading. The proposed patch
>>> doesn't actually fix that commit (which went into v5.19), it simply
>>> applies on that commit.
>>> 
>>> I haven't been able to find the locking patches mentioned here. I think
>>> those bear mentioning (by commit ID) in the patch description, at least.
>>> If you know the commit ID, Trond, can you pass it along?
>>> 
>>> Though I would say that, in agreement with Jeff, the true cause of this
>>> issue is the awkward synopsis for vfs_lock_file().
>> 
>> Since Trond has re-assigned the kernel.org bug to me... I'll blather on
>> a bit more. (Yesterday's patch is still queued up, I can replace it or
>> move it depending on the outcome of this discussion).
>> 
>> -> The vfs_{test,lock,cancel}_file APIs all take a file argument. Maybe
>> we shouldn't remove the @filp argument from vfs_lock_file().
>> 
> 
> They all take a file_lock argument as well. @filp is redundant in all of
> them. Keeping both just increases the ambiguity. I move that we drop the
> explicit argument since we need to set it in the struct anyway.

Sounds good to me.
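
For illustration only, here is roughly the shape I have in mind,
written as a userspace sketch with stand-in types rather than real
kernel code. The names sketch_init_lock() and sketch_lock_file() and
the SKETCH_F_UNLCK value are hypothetical; nothing like them exists in
fs/locks.c today. The idea is that the initializer takes the file, so
fl_file cannot be left unset, and the lock call takes only the
file_lock.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-ins for the kernel types; only the fields used here. */
struct file {
	int id;
};

struct file_lock {
	struct file *fl_file;
	int fl_type;
};

#define SKETCH_F_UNLCK 2	/* stand-in value, not taken from a kernel header */

/*
 * Hypothetical initializer: zeroes the lock, then records the file and
 * the lock type, so a caller cannot forget to set fl_file.
 */
static void sketch_init_lock(struct file_lock *fl, struct file *filp, int type)
{
	assert(fl != NULL);
	memset(fl, 0, sizeof(*fl));
	fl->fl_file = filp;
	fl->fl_type = type;
}

/*
 * Hypothetical lock call without a separate file argument.
 * Returns 0 when fl_file is set, -1 when it is NULL.
 */
static int sketch_lock_file(struct file_lock *fl)
{
	if (fl->fl_file == NULL)
		return -1;
	/* A real implementation would call the file's ->lock method here. */
	return 0;
}
```

With something along these lines, nlm_unlock_files() would initialize
the lock once per open file and never touch fl_file directly.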

> We could also consider adding a @filp argument to locks_alloc_lock and
> locks_init_lock, to make it a bit more evident that it needs to be set.
> 
>> -> The struct file_lock * argument of vfs_lock_file() is not a const.
>> 
> 
> That might be tough. Even for "request" fl's we modify some fields in
> them (for example, fl_wait and fl_blocked_member). fl_file should never
> change though, once it has been assigned. We could potentially make that
> const.
> 
>> After auditing the call sites, I think it would be safe for vfs_lock_file()
>> to explicitly overwrite the fl->fl_file field with the value of the @filp
>> argument before calling f_ops->lock. At the very least, it should sanity-
>> check that the two pointer values are the same, and document that as an
>> API requirement.
>> 
>> Alternatively we could cook up an NFS-specific fix... but the vfs_lock_file
>> API would still look dodgy.
>> 
> 
> I see no reason to do anything NFS-specific here. I'd be fine with
> WARN_ONs in locks.c for now, until we decide what to do longer term.
> It's possible we have some other call chains that are not setting that
> field correctly.

Agreed, a WARN_ON would be a good first step.
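
To make the check concrete, a minimal userspace sketch follows. The
types are stand-ins, and sketch_fl_file_mismatch() is a hypothetical
helper name, not an existing kernel function. In locks.c the fprintf()
would presumably be a WARN_ON_ONCE(), but that is an assumption about
what the eventual patch would look like.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace stand-ins for the kernel types; only the fields used here. */
struct file {
	int id;
};

struct file_lock {
	struct file *fl_file;
};

/*
 * Hypothetical sanity check: returns 1 and prints a warning when the
 * file recorded in the lock differs from the explicit file argument
 * (including when fl_file is NULL), and returns 0 when they agree.
 */
static int sketch_fl_file_mismatch(struct file *filp, struct file_lock *fl)
{
	assert(fl != NULL);
	if (fl->fl_file != filp) {
		fprintf(stderr, "WARN: fl_file %p does not match filp %p\n",
			(void *)fl->fl_file, (void *)filp);
		return 1;
	}
	return 0;
}
```

A caller that leaves fl_file NULL, as the old nlm_unlock_files() did,
would trip this check on the first call.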

> If we can audit all of the call sites and ensure that they are properly
> setting fl_file in the struct, we should be able to painlessly drop the
> separate @filp argument from all of those functions.

The only one I found that doesn't set fl_file close to the vfs_lock_file
call site is do_lock_file_wait().

> I'll toss it onto my to-do pile.

I'm assuming you mean you'll do the API clean-up, and that I should
keep Trond's fix in the nfsd queue.

--
Chuck Lever