> On Nov 7, 2022, at 5:48 AM, Jeff Layton <jlayton@xxxxxxxxxx> wrote: > > On Sun, 2022-11-06 at 14:02 -0500, trondmy@xxxxxxxxxx wrote: >> From: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx> >> >> vfs_lock_file() expects the struct file_lock to be fully initialised by >> the caller. As a reviewer, I don't see anything in the vfs_lock_file() kdoc comment that suggests this, and vfs_lock_file() itself is just a wrapper around each filesystem's f_ops->lock method. That expectation is a bit deeper into NFS-specific code. A few more observations below. >> Re-exported NFSv3 has been seen to Oops if the fl_file field >> is NULL. Needs a Link: to the bug report. Which I can add. This will also give us a call trace we can reference, so I won't add that here. >> Fixes: aec158242b87 ("lockd: set fl_owner when unlocking files") >> Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx> >> --- >> fs/lockd/svcsubs.c | 17 ++++++++++------- >> 1 file changed, 10 insertions(+), 7 deletions(-) >> >> diff --git a/fs/lockd/svcsubs.c b/fs/lockd/svcsubs.c >> index e1c4617de771..3515f17eaf3f 100644 >> --- a/fs/lockd/svcsubs.c >> +++ b/fs/lockd/svcsubs.c >> @@ -176,7 +176,7 @@ nlm_delete_file(struct nlm_file *file) >> } >> } >> >> -static int nlm_unlock_files(struct nlm_file *file, fl_owner_t owner) >> +static int nlm_unlock_files(struct nlm_file *file, const struct file_lock *fl) >> { >> struct file_lock lock; >> >> @@ -184,12 +184,15 @@ static int nlm_unlock_files(struct nlm_file *file, fl_owner_t owner) >> lock.fl_type = F_UNLCK; >> lock.fl_start = 0; >> lock.fl_end = OFFSET_MAX; >> - lock.fl_owner = owner; >> - if (file->f_file[O_RDONLY] && >> - vfs_lock_file(file->f_file[O_RDONLY], F_SETLK, &lock, NULL)) >> + lock.fl_owner = fl->fl_owner; >> + lock.fl_pid = fl->fl_pid; >> + lock.fl_flags = FL_POSIX; >> + >> + lock.fl_file = file->f_file[O_RDONLY]; >> + if (lock.fl_file && vfs_lock_file(lock.fl_file, F_SETLK, &lock, NULL)) >> goto out_err; >> - if (file->f_file[O_WRONLY] && >> - vfs_lock_file(file->f_file[O_WRONLY], F_SETLK, &lock, NULL)) >> + lock.fl_file = file->f_file[O_WRONLY]; >> + if (lock.fl_file && vfs_lock_file(lock.fl_file, F_SETLK, &lock, NULL)) >> goto out_err; >> return 0; >> out_err: >> @@ -226,7 +229,7 @@ nlm_traverse_locks(struct nlm_host *host, struct nlm_file *file, >> if (match(lockhost, host)) { >> >> spin_unlock(&flctx->flc_lock); >> - if (nlm_unlock_files(file, fl->fl_owner)) >> + if (nlm_unlock_files(file, fl)) >> return 1; >> goto again; >> } > > Good catch. > > I wonder if we ought to roll an initializer function for file_locks to > make it harder for callers to miss setting some fields like this? One > idea: we could change vfs_lock_file to *not* take a file argument, and > insist that the caller fill out fl_file when calling it? That would make > it harder to screw this up. Commit history shows that, at least as far back as the beginning of the git era, the vfs_lock_file() call site here did not initialize the fl_file field. So, this code has been working without fully initializing @fl for, like, forever. Trond later says: > The regression occurs in 5.16, because that was when Bruce merged his > patches to enable locking when doing NFS re-exporting. That means the Fixes: tag above is misleading. The proposed patch doesn't actually fix that commit (which went into v5.19), it simply applies on that commit. I haven't been able to find the locking patches mentioned here. I think those bear mentioning (by commit ID) in the patch description, at least. If you know the commit ID, Trond, can you pass it along? Though I would say that, in agreement with Jeff, the true cause of this issue is the awkward synopsis for vfs_lock_file(). -- Chuck Lever