On Tue, 2018-03-13 at 08:24 +0200, Amir Goldstein wrote:
> [CC some NFS/lock folks (see history below top post)]
>
> On Tue, Mar 13, 2018 at 3:39 AM, Eddie Horng <eddiehorng.tw@xxxxxxxxx> wrote:
> > Hi Amir,
> > Thanks for your prompt response. After comparing flock(1) and my flock(2)
> > test program, it seems the open flag makes the result different. strace
> > result shows open with O_RDONLY flock fails (case A), open with
> > O_RDWR|O_CREAT|O_NOCTTY flock works (case B) and open local ext4 file
> > with O_RDONLY flock works too (case C)
> >
> > case A:
> > strace myflock /mnt/n/foo
> > open("/mnt/n/foo", O_RDONLY) = 3
> > flock(3, LOCK_EX|LOCK_NB) = -1 EBADF (Bad file descriptor)
> >
>
> It looks like flock(1) has special code to handle this case for NFSv4
> and fall back to open O_RDWR:
> https://github.com/karelzak/util-linux/blob/master/sys-utils/flock.c#L295
>
> Although I tested with NFSv3 and open flags used by flock(1)
> were O_RDONLY|O_CREAT|O_NOCTTY
>
> Why do you need to get an exclusive lock on a file that is open for read?
> Can you open the file for write and resolve the issue like flock(1) does?
>
> You should know that even if you manage to lock an O_RDONLY fd,
> if this file is then opened for write by another process, that process will
> get a file descriptor pointing to a *different* inode.
> This is a long standing issue with overlayfs (inconsistent ro/rw fd),
> which is being worked around by some user applications -
> i.e. touch the file before first access to avoid applications
> getting an open file descriptor to the lower inode.
>
> Let me know if this answer suffices or if you get this error only
> with NFSv4 over overlayfs.
>
> > case B:
> > strace flock -x -n /mnt/n/foo echo locked
> > open("/mnt/n/foo", O_RDWR|O_CREAT|O_NOCTTY, 0666) = 3
> > flock(3, LOCK_EX|LOCK_NB) = 0
> >
> > case C:
> > strace myflock /tmp/t
> > open("/tmp/t", O_RDONLY) = 3
> > flock(3, LOCK_EX|LOCK_NB) = 0
> >
>
> So that presumably works because the test is not over NFS and not
> because test is not over NFS+overlayfs, because of no NFSv4 flock
> emulation.
>

Agreed. The real issue here is that NFSv4 emulates flock locks using
LOCK/LOCKT byte-range locks. The NFSv4 spec does not allow you to set a
write lock on a file open read-only, so that just plain doesn't work on
NFSv4.

>
> > Below is my test configuration of case A:
> > - underlying filesystem:
> > ext4
> > - /proc/mounts:
> > /dev/disk/by-uuid/a2d5005c-.... / ext4
> > rw,relatime,errors=remount-ro,data=ordered 0 0
> > none /share overlay
> > rw,relatime,lowerdir=/base/lower,upperdir=/base/upper,workdir=/base/work,index=on,nfs_export=on
> > 0 0
> > localhost:/share /mnt/n nfs4
> > rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=127.0.0.1,local_lock=none,addr=127.0.0.1
> > 0 0
> > - /etc/exports
> > /share *(rw,sync,no_subtree_check,no_root_squash,fsid=41)
> >
> >
> > For dmesg, in case A, there's no any output from dmesg, however in my
> > applications running with overlay nfs exported files, there are some
> > lock related messages. Which lock call triggers it, need more
> > investigation.
> > The message from nfs server side is like:
> > [ 872.940080] Leaked POSIX lock on dev=0x0:0x42 ino=0xf5a1
> > fl_owner=0000000023265f44 fl_flags=0x1 fl_type=0x1 fl_pid=1
> > [ 1939.829655] Leaked locks on dev=0x0:0x42 ino=0xf5a1:
> > [ 1939.829659] POSIX: fl_owner=0000000023265f44 fl_flags=0x1
> > fl_type=0x1 fl_pid=1
> >
>
> I'm not sure what those mean. Maybe NFS folks can shed some light.
>

That means that there was a file_lock associated with this struct file
that was left on the POSIX lock list after filp_close. Either it didn't
get released properly or a lock raced onto the list after
locks_remove_posix ran. That should never happen, so this is likely a
bug.

> Thanks,
> Amir.
>
> >
> > 2018-03-12 20:07 GMT+08:00 Amir Goldstein <amir73il@xxxxxxxxx>:
> > > On Mon, Mar 12, 2018 at 9:38 AM, Eddie Horng <eddiehorng.tw@xxxxxxxxx> wrote:
> > > > Hello Miklos,
> > > > I'd like to report a flock(2) problem to overlay nfs-exported files.
> > > > The error return from flock(2) is "Bad file descriptor".
> > > >
> > > > Environment:
> > > > OS: Ubuntu 14.04.2 LTS
> > > > Kernel: 4.16.0-041600rc4-generic (from
> > > > http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.16-rc4/)
> > > >
> > > > Reproduce step:
> > > > (nfs server side)
> > > > mount -t overlay
> > > > -orw,lowerdir=/mnt/ro,upperdir=/mnt/u,workdir=/mnt/w,nfs_export=on,index=on
> > > > none /mnt/m
> > > > touch /mnt/m/foo
> > > > (nfs client side)
> > > > mount server:/mnt/m /mnt/n
> > > >
> > > > flock /mnt/n/foo
> > > > failed to lock file '/mnt/n/foo': Bad file descriptor
> > >
> > >
> > > Does not reproduce on my end. I am using v4.16-rc5, but I don't think
> > > any of the fixes there are relevant to this failure.
> > >
> > > This is what I have for underlying fs, overlay and nfs mount options
> > > (index and nfs_export are on by default in my kernel):
> > >
> > > /dev/mapper/storage-lower_layer on /base type xfs
> > > (rw,relatime,attr2,inode64,noquota)
> > > share on /share type overlay
> > > (rw,relatime,lowerdir=/base/lower,upperdir=/base/upper/0,workdir=/base/upper/work0)
> > > c800:/share on /mnt/t type nfs
> > > (rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.91.126,mountvers=3,mountport=49494,mountproto=udp,local_lock=none,addr=192.168.91.126)
> > >
> > > $ touch /mnt/t/foo
> > > $ flock -x -n /mnt/t/foo echo locked
> > > locked
> > >
> > > Please share more information about nfs mount options and underlying filesystem
> > >
> > > Please check if you see any relevant errors/warnings in dmesg.
> > >
> > > Thanks,
> > > Amir.

--
Jeff Layton <jlayton@xxxxxxxxxx>
--
To unsubscribe from this list: send the line "unsubscribe linux-unionfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
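
For reference, the read-write fallback discussed in this thread (case A
failing with EBADF, case B succeeding) could be sketched roughly as below.
This is only an illustration under assumptions: the helper name
lock_exclusive_with_fallback is hypothetical, the code is not taken from
util-linux or from the myflock test program, and it simply retries with
O_RDWR when flock(2) on an O_RDONLY descriptor returns EBADF.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread or from util-linux):
 * take a non-blocking exclusive flock on `path`.
 * Returns the locked fd, or -1 with errno set.
 */
int lock_exclusive_with_fallback(const char *path)
{
    int saved;
    int fd = open(path, O_RDONLY | O_NOCTTY);

    if (fd < 0)
        return -1;

    /* Case C in the thread: this succeeds on a local filesystem. */
    if (flock(fd, LOCK_EX | LOCK_NB) == 0)
        return fd;

    saved = errno;
    close(fd);

    /* Anything other than EBADF (e.g. EWOULDBLOCK) is a real failure. */
    if (saved != EBADF) {
        errno = saved;
        return -1;
    }

    /*
     * Case A in the thread: NFSv4 emulates flock with byte-range locks
     * and refuses a write lock on a read-only open. Retry read-write,
     * which corresponds to case B.
     */
    fd = open(path, O_RDWR | O_NOCTTY);
    if (fd < 0)
        return -1;

    if (flock(fd, LOCK_EX | LOCK_NB) != 0) {
        saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}
```

The caller keeps the returned descriptor open for as long as the lock is
needed and closes it to release the lock. Note that the retry requires write
permission on the file, and that on overlayfs the caveat above about
read-only and read-write descriptors referring to different inodes still
applies.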