Re: [PATCH] lockd: set other missing fields when unlocking files

Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> · Mon, 7 Nov 2022 20:50:41 +0000

> On Nov 7, 2022, at 15:34, Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote:
> 
> 
> 
>> On Nov 7, 2022, at 3:22 PM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
>> 
>> On Mon, 2022-11-07 at 18:42 +0000, Trond Myklebust wrote:
>>> 
>>>> On Nov 7, 2022, at 09:12, Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote:
>>>> 
>>>> 
>>>> 
>>>>> On Nov 7, 2022, at 5:48 AM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
>>>>> 
>>>>> On Sun, 2022-11-06 at 14:02 -0500, trondmy@xxxxxxxxxx wrote:
>>>>>> From: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
>>>>>> 
>>>>>> vfs_lock_file() expects the struct file_lock to be fully initialised by
>>>>>> the caller. Re-exported NFSv3 has been seen to Oops if the fl_file field
>>>>>> is NULL.
>>>>>> 
>>>>>> Fixes: aec158242b87 ("lockd: set fl_owner when unlocking files")
>>>>>> Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
>>>>>> ---
>>>>>> fs/lockd/svcsubs.c | 17 ++++++++++-------
>>>>>> 1 file changed, 10 insertions(+), 7 deletions(-)
>>>>>> 
>>>>>> diff --git a/fs/lockd/svcsubs.c b/fs/lockd/svcsubs.c
>>>>>> index e1c4617de771..3515f17eaf3f 100644
>>>>>> --- a/fs/lockd/svcsubs.c
>>>>>> +++ b/fs/lockd/svcsubs.c
>>>>>> @@ -176,7 +176,7 @@ nlm_delete_file(struct nlm_file *file)
>>>>>> }
>>>>>> }
>>>>>> 
>>>>>> -static int nlm_unlock_files(struct nlm_file *file, fl_owner_t owner)
>>>>>> +static int nlm_unlock_files(struct nlm_file *file, const struct file_lock *fl)
>>>>>> {
>>>>>> 	struct file_lock lock;
>>>>>> 
>>>>>> @@ -184,12 +184,15 @@ static int nlm_unlock_files(struct nlm_file *file, fl_owner_t owner)
>>>>>> 	lock.fl_type  = F_UNLCK;
>>>>>> 	lock.fl_start = 0;
>>>>>> 	lock.fl_end   = OFFSET_MAX;
>>>>>> -	lock.fl_owner = owner;
>>>>>> -	if (file->f_file[O_RDONLY] &&
>>>>>> -	    vfs_lock_file(file->f_file[O_RDONLY], F_SETLK, &lock, NULL))
>>>>>> +	lock.fl_owner = fl->fl_owner;
>>>>>> +	lock.fl_pid   = fl->fl_pid;
>>>>>> +	lock.fl_flags = FL_POSIX;
>>>>>> +
>>>>>> +	lock.fl_file = file->f_file[O_RDONLY];
>>>>>> +	if (lock.fl_file && vfs_lock_file(lock.fl_file, F_SETLK, &lock, NULL))
>>>>>> 		goto out_err;
>>>>>> -	if (file->f_file[O_WRONLY] &&
>>>>>> -	    vfs_lock_file(file->f_file[O_WRONLY], F_SETLK, &lock, NULL))
>>>>>> +	lock.fl_file = file->f_file[O_WRONLY];
>>>>>> +	if (lock.fl_file && vfs_lock_file(lock.fl_file, F_SETLK, &lock, NULL))
>>>>>> 		goto out_err;
>>>>>> 	return 0;
>>>>>> out_err:
>>>>>> @@ -226,7 +229,7 @@ nlm_traverse_locks(struct nlm_host *host, struct nlm_file *file,
>>>>>> if (match(lockhost, host)) {
>>>>>> 
>>>>>> spin_unlock(&flctx->flc_lock);
>>>>>> - if (nlm_unlock_files(file, fl->fl_owner))
>>>>>> + if (nlm_unlock_files(file, fl))
>>>>>> return 1;
>>>>>> goto again;
>>>>>> }
>>>>> 
>>>>> Good catch.
>>>>> 
>>>>> I wonder if we ought to roll an initializer function for file_locks to
>>>>> make it harder for callers to miss setting some fields like this? One
>>>>> idea: we could change vfs_lock_file to *not* take a file argument, and
>>>>> insist that the caller fill out fl_file when calling it? That would make
>>>>> it harder to screw this up.
>>>>> 
>>>>> In any case, let's take this patch in the interim while we consider
>>>>> whether and how to clean this up.
>>>>> 
>>>>> Reviewed-by: Jeff Layton <jlayton@xxxxxxxxxx>
>>>> 
>>>> Since this doesn't fix breakage in 6.1-rc, I plan to take it for 6.2.
>>>> If all y'all feel the fix is more urgent than that, let me know.
>>> 
>>> 
>>> It is relevant to fixing https://bugzilla.kernel.org/show_bug.cgi?id=216582
>>> No idea how urgent that is...
>>> 
>> 
>> Seems like it's technically a regression then. Prior to aec158242b87,
>> those locks were being ignored. Now that we actually try to unlock them,
>> this causes a crash.
> 
> The reporter can reproduce a crash back to v5.16. So, it's a regression,
> but not one in v6.1-rc. I'm trying to be more strict about that to prevent
> quickly backporting fixes that have bugs.
> 
> 
>> I move for sending it to mainline sooner rather than later.
> 
> I'd rather give this one more time in linux-next. The Fixes: tag will
> trigger automatic backport once v6.2-rc1 closes. The fix is available
> to the reporter to apply to his kernel.

The regression dates back to 5.16, because that is when Bruce merged his patches to enable locking when doing NFS re-exporting. Until the fix is applied, the workaround is therefore to mount the NFS filesystem with -o nolock on the re-exporting server.

That said, the patch is trivially correct.
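
For anyone following along, here is a rough user-space model of the pattern the patch uses, combined with the initializer idea Jeff raised earlier in the thread. It is only a sketch: every model_* and MODEL_* name is hypothetical, the struct is a cut-down stand-in for struct file_lock, and model_vfs_lock_file() merely imitates a filesystem that relies on fl_file and the POSIX flag being set. None of this is kernel code.

```c
/* User-space model only. All model_* / MODEL_* names are hypothetical
 * stand-ins and do not match the kernel's real definitions. */
#include <assert.h>
#include <stddef.h>

enum { MODEL_RDONLY = 0, MODEL_WRONLY = 1 };

#define MODEL_F_UNLCK    2
#define MODEL_FL_POSIX   1u
#define MODEL_OFFSET_MAX 0x7fffffffffffffffLL

struct model_file {
	const char *name;
};

/* Cut-down stand-in for struct file_lock. */
struct model_file_lock {
	void *fl_owner;
	int fl_pid;
	unsigned int fl_flags;
	int fl_type;
	long long fl_start;
	long long fl_end;
	struct model_file *fl_file;
};

/* Stand-in for vfs_lock_file(): fail (instead of crashing) when the
 * caller left fl_file unset or forgot the POSIX flag. */
static int model_vfs_lock_file(struct model_file *filp,
			       struct model_file_lock *fl)
{
	if (fl->fl_file == NULL || fl->fl_file != filp)
		return -1;
	if (!(fl->fl_flags & MODEL_FL_POSIX))
		return -1;
	return 0;
}

/* Hypothetical initializer: build a whole-file unlock request from a
 * held lock. fl_file is deliberately left NULL for the caller to set. */
static void model_init_unlock_all(struct model_file_lock *lock,
				  const struct model_file_lock *held)
{
	assert(lock != NULL && held != NULL);

	*lock = (struct model_file_lock){ 0 };
	lock->fl_type  = MODEL_F_UNLCK;
	lock->fl_start = 0;
	lock->fl_end   = MODEL_OFFSET_MAX;
	lock->fl_owner = held->fl_owner;
	lock->fl_pid   = held->fl_pid;
	lock->fl_flags = MODEL_FL_POSIX;
}

/* Mirrors the shape of the patched function: set fl_file before each
 * call and skip open modes that have no file. Returns 1 on failure. */
static int model_unlock_files(struct model_file *files[2],
			      const struct model_file_lock *held)
{
	struct model_file_lock lock;
	int mode;

	model_init_unlock_all(&lock, held);
	for (mode = MODEL_RDONLY; mode <= MODEL_WRONLY; mode++) {
		lock.fl_file = files[mode];
		if (lock.fl_file && model_vfs_lock_file(lock.fl_file, &lock))
			return 1;
	}
	return 0;
}
```

The point of the model is only that the request carries the owner, pid, flags and file by the time the lower layer sees it. Whether a shared helper like this belongs in the kernel is the open question from the review above.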

_________________________________
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx