On Fri, Jan 08, 2016 at 11:21:01AM -0500, J. Bruce Fields wrote:
> On Fri, Jan 08, 2016 at 11:11:54AM -0500, Jeff Layton wrote:
> > On Fri, 8 Jan 2016 10:55:33 -0500
> > "J. Bruce Fields" <bfields@xxxxxxxxxxxx> wrote:
> >
> > > On Fri, Jan 08, 2016 at 08:50:09AM -0500, Jeff Layton wrote:
> > > > Dmitry reported that he was able to reproduce the WARN_ON_ONCE that
> > > > fires in locks_free_lock_context when the flc_posix list isn't empty.
> > > >
> > > > The problem turns out to be that we're basically rebuilding the
> > > > file_lock from scratch in fcntl_setlk when we discover that the setlk
> > > > has raced with a close. If the l_whence field is SEEK_CUR or SEEK_END,
> > > > then we may end up with fl_start and fl_end values that differ from
> > > > when the lock was initially set, if the file position or length of the
> > > > file has changed in the interim.
> > > >
> > > > Fix this by just reusing the same lock request structure, and simply
> > > > override fl_type value with F_UNLCK as appropriate. That ensures that
> > > > we really are unlocking the lock that was initially set.
> > >
> > > You could also just do a whole-file unlock, couldn't you? That would
> > > seem less confusing to me. But maybe I'm missing something.
> > >
> > > --b.
> > >
> >
> > I considered that too...but I was thinking that might make things even
> > worse. Consider:
> >
> > Thread1                         Thread2
> > ----------------------------------------------------------------------------
> > fd1 = open(...);
> > fd2 = dup(fd1);
> >                                 fcntl(fd2, F_SETLK);
> >                                 (Here we call fcntl, and lock is set, but
> >                                  task gets scheduled out before fcheck)
> > close(fd2)
> > fcntl(fd1, F_SETLK...);
> >                                 Task scheduled back in, does fcheck for fd2
> >                                 and finds that it's gone. Removes the lock
> >                                 that Thread1 just set.
> >
> > If we just unlock the range that was set then Thread1 won't be affected
> > if his lock doesn't overlap Thread2's.
> >
> > Is that better or worse? :)
> >
> > TBH, I guess all of this is somewhat academic. If you're playing with
> > traditional POSIX locks and threads like this, then you really are
> > playing with fire.
> >
> > We should try to fix that if we can though...
>
> Yeah. I almost think an OK interim solution would be just to document
> the race in the appropriate man page and tell people that if they really
> want to use POSIX locks in an application with lots of threads sharing
> file descriptors then they should consider OFD locks.

(Especially if this race has always existed.)

--b.
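
To illustrate why rebuilding the lock request can unlock the wrong range, here is a minimal userspace sketch of how a (l_whence, l_start, l_len) triple resolves to an absolute byte range, following the rules described in fcntl(2). It is not the kernel code. The names struct byte_range, RANGE_EOF and resolve_range are hypothetical, and the pos and size arguments stand in for the file position and file length at the time of the call.

```c
#include <stdio.h>   /* SEEK_SET, SEEK_CUR, SEEK_END */

/* Hypothetical marker for "lock extends to end of file" (l_len == 0). */
#define RANGE_EOF (-1LL)

/* Hypothetical type for this sketch: an inclusive absolute byte range. */
struct byte_range {
    long long start;
    long long end;   /* inclusive, or RANGE_EOF */
};

/*
 * Hypothetical helper: resolve (whence, off, len) into an absolute range.
 * pos and size are the file position and file length at call time.
 * Returns 0 on success, -1 on an unknown whence or a negative start.
 */
int resolve_range(int whence, long long off, long long len,
                  long long pos, long long size,
                  struct byte_range *out)
{
    long long base, start, end;

    switch (whence) {
    case SEEK_SET: base = 0;    break;
    case SEEK_CUR: base = pos;  break;  /* depends on current file position */
    case SEEK_END: base = size; break;  /* depends on current file length */
    default:       return -1;
    }

    start = base + off;
    end = RANGE_EOF;                    /* len == 0: to end of file */
    if (len > 0) {
        end = start + len - 1;
    } else if (len < 0) {
        /* Negative length covers the bytes just before start. */
        end = start - 1;
        start += len;
    }

    if (start < 0)
        return -1;

    out->start = start;
    out->end = end;
    return 0;
}
```

With SEEK_CUR, an offset of 0 and a length of 10, a file position of 100 resolves to bytes 100..109, while a position of 200 resolves to bytes 200..209. An unlock request rebuilt after the position has moved therefore no longer covers the range that was locked, which is why reusing the already-resolved request avoids the problem.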