On Fri, 8 Jan 2016 10:55:33 -0500
"J. Bruce Fields" <bfields@xxxxxxxxxxxx> wrote:

> On Fri, Jan 08, 2016 at 08:50:09AM -0500, Jeff Layton wrote:
> > Dmitry reported that he was able to reproduce the WARN_ON_ONCE that
> > fires in locks_free_lock_context when the flc_posix list isn't empty.
> >
> > The problem turns out to be that we're basically rebuilding the
> > file_lock from scratch in fcntl_setlk when we discover that the setlk
> > has raced with a close. If the l_whence field is SEEK_CUR or SEEK_END,
> > then we may end up with fl_start and fl_end values that differ from
> > when the lock was initially set, if the file position or length of the
> > file has changed in the interim.
> >
> > Fix this by just reusing the same lock request structure, and simply
> > override fl_type value with F_UNLCK as appropriate. That ensures that
> > we really are unlocking the lock that was initially set.
>
> You could also just do a whole-file unlock, couldn't you? That would
> seem less confusing to me. But maybe I'm missing something.
>
> --b.
>

I considered that too...but I was thinking that might make things even
worse. Consider:

Thread1                         Thread2
----------------------------------------------------------------------------
fd1 = open(...);
fd2 = dup(fd1);
                                fcntl(fd2, F_SETLK);
                                (Here we call fcntl, and lock is set, but
                                 task gets scheduled out before fcheck)
close(fd2)
fcntl(fd1, F_SETLK...);
                                Task scheduled back in, does fcheck for fd2
                                and finds that it's gone. Removes the lock
                                that Thread1 just set.

If we just unlock the range that was set then Thread1 won't be affected
if his lock doesn't overlap Thread2's.

Is that better or worse? :)

TBH, I guess all of this is somewhat academic. If you're playing with
traditional POSIX locks and threads like this, then you really are
playing with fire.

We should try to fix that if we can though...

> >
> > While we're there, make sure that we do pop a WARN_ON_ONCE if the
> > removal ever fails. Also return -EBADF in this event, since that's
> > what we would have returned if the close had happened earlier.
> >
> > Cc: "J. Bruce Fields" <bfields@xxxxxxxxxxxx>
> > Cc: Alexander Viro <viro@xxxxxxxxxxxxxxxxxx>
> > Cc: <stable@xxxxxxxxxxxxxxx>
> > Fixes: c293621bbf67 (stale POSIX lock handling)
> > Reported-by: Dmitry Vyukov <dvyukov@xxxxxxxxxx>
> > Signed-off-by: Jeff Layton <jeff.layton@xxxxxxxxxxxxxxx>
> > ---
> >  fs/locks.c | 51 ++++++++++++++++++++++++++++++---------------------
> >  1 file changed, 30 insertions(+), 21 deletions(-)
> >
> > diff --git a/fs/locks.c b/fs/locks.c
> > index 593dca300b29..c263aff793bc 100644
> > --- a/fs/locks.c
> > +++ b/fs/locks.c
> > @@ -2181,7 +2181,6 @@ int fcntl_setlk(unsigned int fd, struct file *filp, unsigned int cmd,
> >  		goto out;
> >  	}
> >
> > -again:
> >  	error = flock_to_posix_lock(filp, file_lock, &flock);
> >  	if (error)
> >  		goto out;
> > @@ -2223,19 +2222,22 @@ again:
> >  	 * Attempt to detect a close/fcntl race and recover by
> >  	 * releasing the lock that was just acquired.
> >  	 */
> > -	/*
> > -	 * we need that spin_lock here - it prevents reordering between
> > -	 * update of i_flctx->flc_posix and check for it done in close().
> > -	 * rcu_read_lock() wouldn't do.
> > -	 */
> > -	spin_lock(&current->files->file_lock);
> > -	f = fcheck(fd);
> > -	spin_unlock(&current->files->file_lock);
> > -	if (!error && f != filp && flock.l_type != F_UNLCK) {
> > -		flock.l_type = F_UNLCK;
> > -		goto again;
> > +	if (!error && file_lock->fl_type != F_UNLCK) {
> > +		/*
> > +		 * We need that spin_lock here - it prevents reordering between
> > +		 * update of i_flctx->flc_posix and check for it done in
> > +		 * close(). rcu_read_lock() wouldn't do.
> > +		 */
> > +		spin_lock(&current->files->file_lock);
> > +		f = fcheck(fd);
> > +		spin_unlock(&current->files->file_lock);
> > +		if (f != filp) {
> > +			file_lock->fl_type = F_UNLCK;
> > +			error = do_lock_file_wait(filp, cmd, file_lock);
> > +			WARN_ON_ONCE(error);
> > +			error = -EBADF;
> > +		}
> >  	}
> > -
> >  out:
> >  	locks_free_lock(file_lock);
> >  	return error;
> > @@ -2321,7 +2323,6 @@ int fcntl_setlk64(unsigned int fd, struct file *filp, unsigned int cmd,
> >  		goto out;
> >  	}
> >
> > -again:
> >  	error = flock64_to_posix_lock(filp, file_lock, &flock);
> >  	if (error)
> >  		goto out;
> > @@ -2363,14 +2364,22 @@ again:
> >  	 * Attempt to detect a close/fcntl race and recover by
> >  	 * releasing the lock that was just acquired.
> >  	 */
> > -	spin_lock(&current->files->file_lock);
> > -	f = fcheck(fd);
> > -	spin_unlock(&current->files->file_lock);
> > -	if (!error && f != filp && flock.l_type != F_UNLCK) {
> > -		flock.l_type = F_UNLCK;
> > -		goto again;
> > +	if (!error && file_lock->fl_type != F_UNLCK) {
> > +		/*
> > +		 * We need that spin_lock here - it prevents reordering between
> > +		 * update of i_flctx->flc_posix and check for it done in
> > +		 * close(). rcu_read_lock() wouldn't do.
> > +		 */
> > +		spin_lock(&current->files->file_lock);
> > +		f = fcheck(fd);
> > +		spin_unlock(&current->files->file_lock);
> > +		if (f != filp) {
> > +			file_lock->fl_type = F_UNLCK;
> > +			error = do_lock_file_wait(filp, cmd, file_lock);
> > +			WARN_ON_ONCE(error);
> > +			error = -EBADF;
> > +		}
> >  	}
> > -
> >  out:
> >  	locks_free_lock(file_lock);
> >  	return error;
> > --
> > 2.5.0

--
Jeff Layton <jlayton@xxxxxxxxxxxxxxx>
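
For anyone following along, the SEEK_CUR/SEEK_END problem described in the
changelog can be sketched in user space. This is only an illustration under
stated assumptions, not the kernel code: resolve_range() and struct abs_range
are hypothetical names, and a negative l_len (which the real conversion
handles) is simply rejected here to keep the sketch short.

```c
#include <limits.h>   /* LLONG_MAX */
#include <stdio.h>    /* SEEK_SET, SEEK_CUR, SEEK_END */

/* Hypothetical type: an absolute, inclusive byte range in the file. */
struct abs_range {
	long long start;
	long long end;
};

/*
 * Hypothetical helper: turn a (whence, l_start, l_len) lock request into
 * an absolute range, given the file position and size at the moment of
 * the call. Returns 0 on success, -1 on an unsupported request.
 */
static int resolve_range(int whence, long long l_start, long long l_len,
			 long long file_pos, long long file_size,
			 struct abs_range *out)
{
	long long base;

	switch (whence) {
	case SEEK_SET:
		base = 0;
		break;
	case SEEK_CUR:
		base = file_pos;	/* depends on the current offset */
		break;
	case SEEK_END:
		base = file_size;	/* depends on the current length */
		break;
	default:
		return -1;
	}

	/* Simplification for this sketch: no negative lengths. */
	if (l_len < 0 || base + l_start < 0)
		return -1;

	out->start = base + l_start;
	/* l_len == 0 means "through the end of the file". */
	out->end = (l_len == 0) ? LLONG_MAX : out->start + l_len - 1;
	return 0;
}
```

Resolving the same SEEK_CUR request twice, once with file_pos at 100 and
once at 200, gives two different absolute ranges. That is why rebuilding the
request for the unlock can miss the lock that was actually set, and why
reusing the already-converted file_lock avoids the problem.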