On Fri, 2018-06-15 at 13:11 +0200, Jan Stancek wrote:
> Hi,
>
> Attached is simplified reproducer (LTP fcntl36), where
> 2 threads try to lock same region in a file. One is
> using posix write lock, the other OFD read lock.
>
> Observed problem: 2 threads obtain lock simultaneously.
>
> --- strace excerpt ---
> [pid 16853] 06:57:11 openat(AT_FDCWD, "tst_ofd_posix_locks", O_RDWR) = 3
> [pid 16854] 06:57:11 openat(AT_FDCWD, "tst_ofd_posix_locks", O_RDWR) = 4
> ...
> [pid 16853] 06:57:12 fcntl(3, F_SETLKW, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=4096} <unfinished ...>
> [pid 16854] 06:57:12 fcntl(4, F_OFD_SETLKW, {l_type=F_RDLCK, l_whence=SEEK_SET, l_start=0, l_len=4096} <unfinished ...>
> [pid 16853] 06:57:12 <... fcntl resumed> ) = 0
> [pid 16853] 06:57:12 nanosleep({tv_sec=0, tv_nsec=100000}, <unfinished ...>
> [pid 16854] 06:57:12 <... fcntl resumed> ) = 0
> --- /strace excerpt ---
>
> fcntl(2) says:
>   Conflicting lock combinations (i.e., a read lock and a write
>   lock or two write locks) where one lock is an open file
>   description lock and the other is a traditional record lock
>   conflict even when they are acquired by the same process on
>   the same file descriptor.
>
> Reproducible on x86_64 VM, with v4.17-11782-gbe779f03d563.
>
> Thanks for having a look,
> Jan
>

tl;dr: I think the test program is buggy. You're running afoul of one of
the behaviors of traditional POSIX locks that caused us to add OFD locks
in the first place: when a process closes any file descriptor for a
file, all of its traditional POSIX locks on that file are dropped, no
matter which descriptor they were acquired through.

Longer explanation:

You have 3 thread pairs, and each thread does a close(fd) at the end of
its thread func. As the threads finish, each one ends up calling
close(fd), and that causes _all_ of the process's traditional POSIX
locks on the file to get released, even ones that might still be in use
by other threads.

If you comment out the close(fd); calls in both thread funcs, then the
program seems to reliably run to completion.

--
Jeff Layton <jlayton@xxxxxxxxxx>
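
For anyone who wants to see the close() behavior described above in isolation, here is a minimal single-threaded sketch in C. It is not the LTP reproducer. The function names and the three-descriptor layout are hypothetical, chosen only for illustration, and it assumes Linux 3.15 or later for F_OFD_GETLK. It takes a traditional POSIX write lock through one descriptor, closes a second, unrelated descriptor for the same file, and uses a third descriptor to probe whether the lock is still there.

```c
#define _GNU_SOURCE            /* exposes F_OFD_GETLK with glibc on Linux */
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#ifndef F_OFD_GETLK
#define F_OFD_GETLK 36         /* Linux value; fallback if headers hide it */
#endif

/* Probe bytes [0, 4096) through fd's open file description.
 * Returns 1 if a conflicting lock exists, 0 if none, -1 on error. */
static int region_is_locked(int fd)
{
    struct flock fl;
    memset(&fl, 0, sizeof(fl));        /* l_pid must be 0 for OFD commands */
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 4096;
    if (fcntl(fd, F_OFD_GETLK, &fl) == -1)
        return -1;
    return fl.l_type != F_UNLCK;
}

/* Hypothetical helper: returns 1 if closing an unrelated descriptor
 * dropped the POSIX lock, 0 if the lock survived, -1 on error. */
int posix_lock_dropped_by_unrelated_close(const char *path)
{
    assert(path != NULL);

    int lock_fd  = open(path, O_RDWR | O_CREAT, 0600);
    int probe_fd = open(path, O_RDWR);
    int extra_fd = open(path, O_RDWR);
    int result = -1;

    if (lock_fd != -1 && probe_fd != -1 && extra_fd != -1) {
        struct flock fl;
        memset(&fl, 0, sizeof(fl));
        fl.l_type = F_WRLCK;           /* traditional POSIX write lock */
        fl.l_whence = SEEK_SET;
        fl.l_start = 0;
        fl.l_len = 4096;

        if (fcntl(lock_fd, F_SETLK, &fl) == 0 &&
            region_is_locked(probe_fd) == 1) {
            /* Close a descriptor that never touched the lock. */
            close(extra_fd);
            extra_fd = -1;

            int after = region_is_locked(probe_fd);
            if (after >= 0)
                result = (after == 0);
        }
    }

    if (extra_fd != -1) close(extra_fd);
    if (probe_fd != -1) close(probe_fd);
    if (lock_fd != -1)  close(lock_fd);
    return result;
}
```

Calling posix_lock_dropped_by_unrelated_close() on a scratch file in a writable directory should return 1 on Linux, since the lock goes away even though lock_fd itself is still open. The probe uses an OFD command on purpose: a plain F_GETLK issued by the same process would not report the process's own POSIX lock as a conflict.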