Hi Amir,
I've confirmed that the flock scenario works with NFSv3:

(case A, with /mnt/n mounted with -o vers=3)
open("/mnt/n/foo", O_RDONLY) = 3
flock(3, LOCK_EX|LOCK_NB)    = 0

I plan to go with NFSv3 rather than change the open flag, because the
behavior comes from an Android build toolchain:
https://android.googlesource.com/platform/art/+/master/dex2oat/dex2oat.cc,
line 2327. It could take a long time to change it.

Thanks for your nice advice about the inconsistent ro/rw fd issue,
I'll watch out for it.

However, after switching to NFSv3 I ran into another problem, with
readdir(3): it returns dirent->d_type = 0 (DT_UNKNOWN) for entries in
an overlay-nfs-exported dir mounted with NFSv3. The same overlay nfs
share mounted with NFSv4 returns the correct d_type. The readdir(3)
man page does say:

    Currently, only some filesystems (among them: Btrfs, ext2, ext3,
    and ext4) have full support for returning the file type in
    d_type. All applications must properly handle a return of
    DT_UNKNOWN.

but I tested an ext4 shared dir over an NFSv3 mount and d_type works
there, and native access to the overlay dir works too. Do you have any
idea where the problem could be?
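In case it is useful, this is essentially what my d_type check does,
including the DT_UNKNOWN fallback the man page asks for (a minimal
sketch, not my exact test program; /mnt/n is the NFSv3 mount from
case A):

/*
 * Sketch: list a directory and fall back to fstatat(2) when the
 * filesystem reports DT_UNKNOWN, as readdir(3) requires applications
 * to do.  IFTODT() converts a stat mode to a dirent type.
 */
#include <dirent.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>

int main(void)
{
        DIR *dir = opendir("/mnt/n");
        struct dirent *de;

        if (!dir) {
                perror("opendir");
                return EXIT_FAILURE;
        }
        while ((de = readdir(dir)) != NULL) {
                unsigned int type = de->d_type;

                if (type == DT_UNKNOWN) {
                        /* fall back to asking the server explicitly */
                        struct stat st;

                        if (fstatat(dirfd(dir), de->d_name, &st,
                                    AT_SYMLINK_NOFOLLOW) == 0)
                                type = IFTODT(st.st_mode);
                }
                printf("%-20s d_type=%u\n", de->d_name, type);
        }
        closedir(dir);
        return EXIT_SUCCESS;
}

The fallback works here, but it adds a stat (GETATTR) round trip per
entry, which is what I was hoping a correct d_type would let me avoid
on large directories.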
thanks,
Eddie

2018-03-13 14:24 GMT+08:00 Amir Goldstein <amir73il@xxxxxxxxx>:
> [CC some NFS/lock folks (see history below top post)]
>
> On Tue, Mar 13, 2018 at 3:39 AM, Eddie Horng <eddiehorng.tw@xxxxxxxxx> wrote:
>> Hi Amir,
>> Thanks for your prompt response. After comparing flock(1) and my
>> flock(2) test program, it seems the open flag makes the difference.
>> strace shows that open with O_RDONLY + flock fails (case A), open
>> with O_RDWR|O_CREAT|O_NOCTTY + flock works (case B), and opening a
>> local ext4 file with O_RDONLY + flock works too (case C).
>>
>> case A:
>> strace myflock /mnt/n/foo
>> open("/mnt/n/foo", O_RDONLY) = 3
>> flock(3, LOCK_EX|LOCK_NB) = -1 EBADF (Bad file descriptor)
>>
>
> It looks like flock(1) has special code to handle this case for NFSv4
> and falls back to opening O_RDWR:
> https://github.com/karelzak/util-linux/blob/master/sys-utils/flock.c#L295
>
> Although I tested with NFSv3 and the open flags used by flock(1),
> which were O_RDONLY|O_CREAT|O_NOCTTY.
>
> Why do you need to get an exclusive lock on a file that is open for
> read? Can you open the file for write and resolve the issue like
> flock(1) does?
>
> You should know that even if you manage to lock an O_RDONLY fd,
> if this file is then opened for write by another process, that
> process will get a file descriptor pointing to a *different* inode.
> This is a long-standing issue with overlayfs (inconsistent ro/rw fd),
> which is being worked around by some user applications -
> i.e. touch the file before first access to avoid applications
> getting an open file descriptor to the lower inode.
>
> Let me know if this answer suffices or if you get this error only
> with NFSv4 over overlayfs.
>
>> case B:
>> strace flock -x -n /mnt/n/foo echo locked
>> open("/mnt/n/foo", O_RDWR|O_CREAT|O_NOCTTY, 0666) = 3
>> flock(3, LOCK_EX|LOCK_NB) = 0
>>
>> case C:
>> strace myflock /tmp/t
>> open("/tmp/t", O_RDONLY) = 3
>> flock(3, LOCK_EX|LOCK_NB) = 0
>>
>
> So case C presumably works because the test is not over NFS at all
> (hence no NFSv4 flock emulation), not because it is not over
> NFS+overlayfs.
>
>> Below is my test configuration for case A:
>> - underlying filesystem:
>> ext4
>> - /proc/mounts:
>> /dev/disk/by-uuid/a2d5005c-.... / ext4
>> rw,relatime,errors=remount-ro,data=ordered 0 0
>> none /share overlay
>> rw,relatime,lowerdir=/base/lower,upperdir=/base/upper,workdir=/base/work,index=on,nfs_export=on
>> 0 0
>> localhost:/share /mnt/n nfs4
>> rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=127.0.0.1,local_lock=none,addr=127.0.0.1
>> 0 0
>> - /etc/exports
>> /share *(rw,sync,no_subtree_check,no_root_squash,fsid=41)
>>
>> As for dmesg: in case A there is no output at all. However, when my
>> applications run against overlay nfs-exported files there are some
>> lock-related messages; which lock call triggers them needs more
>> investigation.
>> The messages on the nfs server side look like:
>> [  872.940080] Leaked POSIX lock on dev=0x0:0x42 ino=0xf5a1
>> fl_owner=0000000023265f44 fl_flags=0x1 fl_type=0x1 fl_pid=1
>> [ 1939.829655] Leaked locks on dev=0x0:0x42 ino=0xf5a1:
>> [ 1939.829659] POSIX: fl_owner=0000000023265f44 fl_flags=0x1
>> fl_type=0x1 fl_pid=1
>>
>
> I'm not sure what those mean. Maybe the NFS folks can shed some light.
>
> Thanks,
> Amir.
>
>>
>> 2018-03-12 20:07 GMT+08:00 Amir Goldstein <amir73il@xxxxxxxxx>:
>>> On Mon, Mar 12, 2018 at 9:38 AM, Eddie Horng <eddiehorng.tw@xxxxxxxxx> wrote:
>>>> Hello Miklos,
>>>> I'd like to report a flock(2) problem on overlay nfs-exported
>>>> files. The error returned from flock(2) is "Bad file descriptor".
>>>>
>>>> Environment:
>>>> OS: Ubuntu 14.04.2 LTS
>>>> Kernel: 4.16.0-041600rc4-generic (from
>>>> http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.16-rc4/)
>>>>
>>>> Steps to reproduce:
>>>> (nfs server side)
>>>> mount -t overlay
>>>> -orw,lowerdir=/mnt/ro,upperdir=/mnt/u,workdir=/mnt/w,nfs_export=on,index=on
>>>> none /mnt/m
>>>> touch /mnt/m/foo
>>>> (nfs client side)
>>>> mount server:/mnt/m /mnt/n
>>>>
>>>> flock /mnt/n/foo
>>>> failed to lock file '/mnt/n/foo': Bad file descriptor
>>>>
>>>
>>> Does not reproduce on my end. I am using v4.16-rc5, but I don't
>>> think any of the fixes there are relevant to this failure.
>>>
>>> This is what I have for the underlying fs, overlay and nfs mount
>>> options (index and nfs_export are on by default in my kernel):
>>>
>>> /dev/mapper/storage-lower_layer on /base type xfs
>>> (rw,relatime,attr2,inode64,noquota)
>>> share on /share type overlay
>>> (rw,relatime,lowerdir=/base/lower,upperdir=/base/upper/0,workdir=/base/upper/work0)
>>> c800:/share on /mnt/t type nfs
>>> (rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.91.126,mountvers=3,mountport=49494,mountproto=udp,local_lock=none,addr=192.168.91.126)
>>>
>>> $ touch /mnt/t/foo
>>> $ flock -x -n /mnt/t/foo echo locked
>>> locked
>>>
>>> Please share more information about your nfs mount options and
>>> underlying filesystem.
>>>
>>> Please check whether you see any relevant errors/warnings in dmesg.
>>>
>>> Thanks,
>>> Amir.
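
P.S. For anyone trying to reproduce case A: myflock is essentially
just the open()+flock() pair from the strace output above. Here is a
sketch of it with the flock(1)-style O_RDWR retry added (the retry is
the workaround being discussed, not part of my original test):

/*
 * Minimal flock(2) test.  NFSv4 emulates flock() with byte-range
 * locks, which apparently need a writable fd for an exclusive lock,
 * hence the EBADF on the O_RDONLY fd in case A.
 */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/file.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
        int fd;

        if (argc != 2) {
                fprintf(stderr, "usage: %s <file>\n", argv[0]);
                return EXIT_FAILURE;
        }
        fd = open(argv[1], O_RDONLY);
        if (fd < 0) {
                perror("open");
                return EXIT_FAILURE;
        }
        if (flock(fd, LOCK_EX | LOCK_NB) < 0) {
                if (errno != EBADF) {
                        perror("flock");
                        return EXIT_FAILURE;
                }
                /* reopen for write and retry, like flock(1) does */
                close(fd);
                fd = open(argv[1], O_RDWR);
                if (fd < 0 || flock(fd, LOCK_EX | LOCK_NB) < 0) {
                        perror("flock (O_RDWR retry)");
                        return EXIT_FAILURE;
                }
        }
        printf("locked %s\n", argv[1]);
        return EXIT_SUCCESS;        /* lock released on exit */
}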
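
P.P.S. On the "touch the file before first access" workaround for the
inconsistent ro/rw fd issue, this is roughly what I take it to mean (a
sketch based on my understanding that a setattr triggers the copy-up;
I have not confirmed this is exactly what those applications do):

/*
 * "Touch" an existing file so that overlayfs copies it up to the
 * upper layer before anyone opens it O_RDONLY; later readers and
 * writers should then see the same (upper) inode.  utimensat(2) with
 * a NULL timespec updates the timestamps to the current time, which
 * is a setattr and should be enough to force the copy-up.
 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>

int main(int argc, char *argv[])
{
        if (argc != 2) {
                fprintf(stderr, "usage: %s <file>\n", argv[0]);
                return EXIT_FAILURE;
        }
        /* equivalent of "touch <file>" for an existing file */
        if (utimensat(AT_FDCWD, argv[1], NULL, 0) < 0) {
                perror("utimensat");
                return EXIT_FAILURE;
        }
        return EXIT_SUCCESS;
}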