On Wed, Aug 23, 2023 at 5:50 PM Amir Goldstein <amir73il@xxxxxxxxx> wrote: > > On Wed, Aug 23, 2023 at 5:52 PM Miklos Szeredi <miklos@xxxxxxxxxx> wrote: > > > > On Wed, 23 Aug 2023 at 16:43, Amir Goldstein <amir73il@xxxxxxxxx> wrote: > > > > > If we do this, then both overlay.whiteout and overlay.xattr_whiteouts > > > xattrs will be xattrs that the overlayfs driver never sets. > > > It's a precedent, but as long as it is properly documented and encoded > > > in fstests, I will be fine with it. Not sure about Miklos. > > > > Firstly I need to properly understand the proposal. At this point I'm > > not sure what overlay.whiteout is supposed to mean. Does it mean the > > same as a whiteout (chrdev(0,0))? Or does it mean that overlayfs > > should not treat it as a whiteout, but instead transform that into a > > chrdev(0,0) for the top overlay to interpret as a whiteout? Or > > something else? > > > > My proposal does not involve any transformation. > It is "just" to support another format for a whiteout. > Transforming a REG or FIFO real object to CHR ovl object > will be a pain IMO and I don't see why it is needed. > > Let me try again from the top: > 1. ovl_path_is_whiteout() checks if either ovl_is_whiteout() (chardev(0,0)) > or regular file with "overlay.whiteout" xattr, so ovl_lookup() > will result in > a negative dentry if either whiteout format is found at topmost layer > 2. To optimize unneeded getxattr, "overlay.whiteout" xattr could be checked > only in case the parent dir has xattr "overlay.xattr_whiteouts" > 3. mkfs.composefs is responsible of creating the non-chardev whiteouts > as well as the marking the dirs that contains them with > "overlay.xattr_whiteouts" - overlayfs itself never does that > 4. ovl_calc_d_ino() (from readdir on a merge dir) returns true if the > the iterated dir has "overlay.xattr_whiteouts" xattr > 5. That will cause ovl_cache_update_ino() to lookup the > *overlay dentry* that will be negative (as per rule 1 above) > if either whiteout format is found at topmost layer and that > will cause the readdir dirent to be marked is_whiteout and > filtered out of readdir results > > * The trick for readdir is that the the per dirent DT_CHR optimization > is traded off for a per parent dir optimization, but even the worst case > where all directories have xattr_whiteouts, readdir is not more > expensive than readdir with xino enabled, which is the default for > several Linux distros > > Hope this is more clear? Ok, so I implemented this, both using the transforming-to-whiteout and the alternative-whiteout approach. Here is the transform-to-whiteout approach: https://github.com/alexlarsson/linux/tree/ovl-nesting-transform In it, if you have a lower dir with these files/xattrs: * lowerdir/foo - directory trusted.overlay.whiteouts * lowerdir/foo/hide_file - regular file trusted.overlay.whiteout Then you will get an overlay no-userxattr mount like this: * lowerdir/foo - directory * lowerdir/foo/hide_file - chardev(0,0) file This can be used as a lower in any overlayfs mount you want, userxattr or no. Here is the alternative-whiteout approach: https://github.com/alexlarsson/linux/tree/ovl-nesting-alternative In it, if you have a lower dir with these files/xattrs: * lowerdir/foo - directory trusted.overlay.overlay.whiteouts user.overlay.whiteouts * lowerdir/foo/hide_file - regular file trusted.overlay.overlay.whiteout user.overlay.whiteout Then you will get an overlay no-userxattr mount like this: * lowerdir/foo - directory trusted.overlay.whiteouts user.overlay.whiteouts * lowerdir/foo/hide_file - regular file trusted.overlay.whiteout user.overlay.whiteout This can be used as a lower in any overlayfs mount you want, userxattr or no. I prefer the transform-to-whiteout approach for a two reasons: Given an existing image (say an OCI image) we can construct an overlayfs mount that is not just functionally identical, but also indistinguible from the expected one. I.e. if the original OCI image had a chardev(0,0) we will still have one in the mount. When creating multiple lower dirs (e.g. from a multi-layered OCI image) you have to carry forward xattrs on directories from the lower layers to the upper, otherwise a merged directory from a higher layer will overwrite the "overlay.whiteouts" xattr. This makes an otherwise local operation (just escape the files in this layer) to a global one that depends on all layers. In terms of implementation complexity I think they are very similar. -- =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Alexander Larsson Red Hat, Inc alexl@xxxxxxxxxx alexander.larsson@xxxxxxxxx