>This problem should not exist in stable kernel >= v4.14
But My host's kernel version is 4.9.29.

Sorry, I got it wrong. I misread v4.14 as 4.1.4. I will ask my
classmate to update the kernel version. Thanks!!

On Mon, Apr 1, 2019 at 10:52 AM koishi komeiji <maykagura@xxxxxxxxx> wrote:
>
> Sorry for replying late.
> >This problem should not exist in stable kernel >= v4.14
> But my host's kernel version is 4.9.29.
>
> Kernel info:
> #uname -a
> Linux z15572-virtual-machine 4.9.29 #1 SMP Tue Oct 16 22:13:48 CST
> 2018 x86_64 x86_64 x86_64 GNU/Linux
>
> #cat /proc/version
> Linux version 4.9.29 (root) (gcc version 5.4.0 20160609 (Ubuntu
> 5.4.0-6ubuntu1~16.04.11) ) #1 SMP Tue Oct 16 22:13:48 CST 2018
>
> >The problem described in the patch was solved in upstream kernel by
> >commits:
> >09d8b586731b ovl: move __upperdentry to ovl_inode
> >31747eda41ef ovl: hash directory inodes for fsnotify
>
> Have these two commits been backported to kernel 4.9.29?
>
> I will try to use your patch. Thank you very much!
>
> On Fri, Mar 29, 2019 at 7:56 PM Amir Goldstein <amir73il@xxxxxxxxx> wrote:
> >
> > On Fri, Mar 29, 2019 at 10:02 AM koishi komeiji <maykagura@xxxxxxxxx> wrote:
> > >
> > >> I imagine that if you restart container or just drop caches problem
> > >> goes away?
> > >
> > > I tried dropping caches, and the problem went away immediately.
> > > So I think the problem is in the dentry and inode cache.
> > >
> > > # ls -l | grep invalid
> > > ls: cannot read symbolic link librmifm.so: Invalid argument
> > > ls: cannot read symbolic link libdrv_vlan.so: Invalid argument
> > > #
> > > # echo 2 > /proc/sys/vm/drop_caches
> > > #
> > > # ls -l | grep invalid
> > > #
> > >
> > >
> > >> Do you know if the symlink was just recently created or if it existed before
> > >> container has been started?
> > >
> > > The symlink in the image is ok. When the container starts, the symlink
> > > is decompressed from a cpio file. At this time, it is still ok.
> > > Just after some time, it breaks suddenly.
> > >
> > >
> >
> > Koishi,
> >
> > I have a theory, but it is not so easy to test.
> > I am not sure if you are able to compile and install a new overlayfs module.
> > Attached is an untested patch to solve the speculated problem.
> >
> > This problem should not exist in stable kernel >= v4.14, so if you
> > can update the host kernel that would be the best option for you.
> >
> > The problem described in the patch was solved in upstream kernel by
> > commits:
> > 09d8b586731b ovl: move __upperdentry to ovl_inode
> > 31747eda41ef ovl: hash directory inodes for fsnotify
> >
> > But those depend on many other changes, so cannot be easily
> > backported to kernel v4.9.
> >
> > What I speculate happens is:
> > 1. Some ovl dentry holds a reference on upper dentry
> > 2. upper dentry holds a ref on upper NON-symlink inode
> > 3. overlay inode->i_private points to upper inode, but does not
> >    hold a refcount on upper inode
> > 4. overlay inode is hashed by value of upper inode pointer
> > 5. At some point, overlay dentry, upper dentry and upper inode
> >    are dropped, but overlay inode remains with elevated refcount
> >    (maybe by inotify) and hashed
> > 6. A new lookup of the symlink allocates a new upper inode
> >    struct and reuses the address of the freed NON-symlink upper inode
> > 7. ovl lookup finds the old ovl inode in cache by upper inode
> >    address, but the found overlay inode is not a symlink
> > 8. The new ovl dentry is !d_is_symlink, although stat(2) will show
> >    i_mode from real inode that is a symlink
> > 9. readlink(2) of ovl dentry will also fail with -EINVAL
> > 10. open(2) on ovl dentry will not try to follow symlink and will
> >     eventually fail to open the symlink itself with -ELOOP
> >
> > Thanks,
> > Amir.
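
For readers following the speculated sequence above, here is a small
user-space toy model of steps 3 to 7. It is only a sketch and not kernel
code: the names Allocator, UpperInode, OvlInode, ovl_lookup and
drop_caches are hypothetical, and the "slab" simply hands back the most
recently freed address to stand in for the address reuse in step 6.

```python
from dataclasses import dataclass


@dataclass
class UpperInode:
    addr: int   # stands in for the struct inode pointer value
    kind: str   # "regular" or "symlink"


@dataclass
class OvlInode:
    upper_addr: int  # raw pointer copy, no reference held (step 3)
    kind: str        # file type fixed when the overlay inode was created


class Allocator:
    """Toy slab (assumption): freed addresses are reused first."""

    def __init__(self):
        self.free = []
        self.next_addr = 0x1000

    def alloc(self, kind):
        if self.free:
            addr = self.free.pop()
        else:
            addr = self.next_addr
            self.next_addr += 0x100
        return UpperInode(addr, kind)

    def release(self, inode):
        self.free.append(inode.addr)


def ovl_lookup(cache, upper):
    """Find or create an overlay inode, hashed only by upper address (step 4)."""
    ovl = cache.get(upper.addr)
    if ovl is None:
        ovl = OvlInode(upper.addr, upper.kind)
        cache[upper.addr] = ovl
    return ovl


def drop_caches(cache):
    """Toy counterpart of dropping the inode cache."""
    cache.clear()


def simulate():
    slab = Allocator()
    cache = {}
    old_upper = slab.alloc("regular")
    old_ovl = ovl_lookup(cache, old_upper)
    slab.release(old_upper)            # step 5: upper freed, ovl inode stays hashed
    new_upper = slab.alloc("symlink")  # step 6: same address, different file type
    found = ovl_lookup(cache, new_upper)  # step 7: stale overlay inode is found
    return cache, old_ovl, new_upper, found


if __name__ == "__main__":
    cache, old_ovl, new_upper, found = simulate()
    print("upper is", new_upper.kind, "but cached overlay inode is", found.kind)
    drop_caches(cache)
    print("after drop_caches:", ovl_lookup(cache, new_upper).kind)
```

In this model the stale entry disappears once the cache is cleared, which
is consistent with the drop_caches observation earlier in the thread,
though it does not prove that this is what happens in the kernel.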