On Sat, Apr 10, 2021 at 1:04 AM Chris Murphy <lists@xxxxxxxxxxxxxxxxx> wrote:
>
> Hi,
>
> The primary problem is Bolt (Thunderbolt 3) tests that are
> experiencing a regression when run in a container using overlayfs,
> failing at:
>
> Bail out! ERROR:../tests/test-common.c:1413:test_io_dir_is_empty:
> 'empty' should be FALSE
>
> https://gitlab.freedesktop.org/bolt/bolt/-/issues/171#note_872119
>

To summarize, the test case is:

- create empty dir
- open empty dir
- getdents => (".", "..")
- create file at (dirfd, "a",
- lseek to offset 0 on dirfd
- getdents => (".", "..") FAIL to see "a"

It looks like a bug in ovl readdir cache invalidation, except that there
is not supposed to be any caching of a pure upper dir.

One thing I noticed is that ovl_dentry_version_inc() is inconsistent
with ovl_dir_is_real() - the latter checks whether readdir caching would
be used and the former checks whether invalidating the readdir cache is
needed.

We need to change the ovl_dentry_version_inc() test to:

    if (ovl_test_flag(OVL_WHITEOUTS, dir) || impurity)

Or better yet:

    if (!ovl_dir_is_real() || impurity)

But this still doesn't explain the reported issue.

The OVL_WHITEOUTS inode flag is set in ovl_get_inode() in several cases,
including:

    ovl_check_origin_xattr(ofs, upperdentry)

So now we are getting closer to something that sounds related to the
reported issue...

ovl_check_origin_xattr() would return true if
vfs_getxattr(upperdentry, "trusted.overlay.origin", NULL, 0)
returned 0 instead of -ENODATA for some reason, even though that xattr
does not exist.

But we happen to be missing a pr_debug() in ovl_do_getxattr(), so it's
hard to say what's going on.

Chris,

As the first step, can you try the suggested fix to
ovl_dentry_version_inc() and/or add the missing pr_debug() and
include those prints in your report?
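
For checking a patched kernel without building Bolt, the test case
summarized above could be sketched as a small C helper. This is only a
rough sketch, not the Bolt test itself: the function names
repro_readdir_after_create() and count_entries(), the "ovl-repro-XXXXXX"
directory template and the return-value convention are made up for
illustration, and it assumes the parent directory passed in is writable
and lives on the overlayfs mount under test.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Record layout returned by getdents64(2); see the man page. */
struct linux_dirent64 {
	unsigned long long d_ino;
	long long d_off;
	unsigned short d_reclen;
	unsigned char d_type;
	char d_name[];
};

/* Count entries other than "." and ".." from the current offset of dirfd. */
static int count_entries(int dirfd)
{
	char buf[4096] __attribute__((aligned(8)));
	int count = 0;
	long n;

	while ((n = syscall(SYS_getdents64, dirfd, buf, sizeof(buf))) > 0) {
		for (long pos = 0; pos < n; ) {
			struct linux_dirent64 *d = (void *)(buf + pos);

			if (strcmp(d->d_name, ".") && strcmp(d->d_name, ".."))
				count++;
			pos += d->d_reclen;
		}
	}
	return n < 0 ? -1 : count;
}

/*
 * Hypothetical reproducer: returns 0 if "a" is visible after the rewind,
 * 1 if it is not (the reported failure), -1 on a setup error.
 */
static int repro_readdir_after_create(const char *parent)
{
	char path[4096];
	int dirfd, fd, before, after = -1;

	/* create empty dir */
	snprintf(path, sizeof(path), "%s/ovl-repro-XXXXXX", parent);
	if (!mkdtemp(path))
		return -1;

	/* open empty dir */
	dirfd = open(path, O_RDONLY | O_DIRECTORY);
	if (dirfd < 0) {
		rmdir(path);
		return -1;
	}

	/* getdents => (".", "..") */
	before = count_entries(dirfd);

	/* create file "a" relative to dirfd */
	fd = openat(dirfd, "a", O_CREAT | O_WRONLY, 0600);
	if (fd >= 0) {
		close(fd);
		/* lseek to offset 0 on dirfd, then getdents again */
		if (lseek(dirfd, 0, SEEK_SET) == 0)
			after = count_entries(dirfd);
		unlinkat(dirfd, "a", 0);
	}

	close(dirfd);
	rmdir(path);

	if (before != 0 || after < 0)
		return -1;
	return after == 1 ? 0 : 1;
}
```

Calling repro_readdir_after_create() from main() with a directory inside
the container and printing the result should be enough; a return value of
1 would correspond to the second getdents not seeing "a".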

> I can reproduce this with 5.12.0-0.rc6.184.fc35.x86_64+debug and at
> approximately the same time I see one, sometimes more, kernel
> messages:
>
> [ 6295.379283] overlayfs: upper fs does not support xattr, falling
> back to index=off and metacopy=off.
>

Can you say why there is no xattr support?
Is the overlayfs mount executed without privileges to create
trusted.* xattrs?
The answer to that may be the key to understanding the bug.

> But I don't know if that kernel message relates to the bolt test failure.
>
> If I run the test outside of a container, it doesn't fail. If I run
> the test in a podman container using the btrfs driver instead of the
> overlay driver, it doesn't fail. So it seems like this is an overlayfs
> bug, but could be some kind of overlayfs+btrfs interaction.
>

My guess is it has to do with changes related to mounting overlayfs
inside userns, but I couldn't find any immediate suspects.

Do you have any idea since when the regression appeared?
A bisect would have been helpful here.

> Could this be related and just not yet merged?
> https://lore.kernel.org/linux-unionfs/20210309162654.243184-1-amir73il@xxxxxxxxx/
>

Not likely.
If you want to be sure, do:

    echo N > /sys/module/overlay/parameters/xino_auto

before starting the container.
The above commit only matters for xino_auto = Y.

Thanks,
Amir.