On Tue, Aug 22, 2023 at 6:43 PM Alexander Larsson <alexl@xxxxxxxxxx> wrote:
>
> On Tue, Aug 22, 2023 at 5:31 PM Amir Goldstein <amir73il@xxxxxxxxx> wrote:
> >
> > On Tue, Aug 22, 2023 at 5:36 PM Alexander Larsson <alexl@xxxxxxxxxx> wrote:
> > >
> > > On Tue, Aug 22, 2023 at 4:25 PM Alexander Larsson <alexl@xxxxxxxxxx> wrote:
> > > >
> > > > On Tue, Aug 22, 2023 at 3:56 PM Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
> > > > >
> > > > > On Tue, 22 Aug 2023 at 15:22, Alexander Larsson <alexl@xxxxxxxxxx> wrote:
> > > > > >
> > > > > > On Mon, Aug 21, 2023 at 1:00 PM Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
> > > > > > >
> > > > > > > On Thu, 17 Aug 2023 at 13:05, Alexander Larsson <alexl@xxxxxxxxxx> wrote:
> > > > > > > >
> > > > > > > > This is needed to properly stack overlay filesystems, I.E, being able
> > > > > > > > to create a whiteout file on an overlay mount and then use that as
> > > > > > > > part of the lowerdir in another overlay mount.
> > > > > > > >
> > > > > > > > The way this works is that we create a regular whiteout, but set the
> > > > > > > > `overlay.nowhiteout` xattr on it. Whenever we check if a file is a
> > > > > > > > whiteout we check this xattr and don't treat it as a whiteout if it is
> > > > > > > > set. The xattr itself is then stripped and when viewed as part of the
> > > > > > > > overlayfs mount it looks like a regular whiteout.
> > > > > > > >
> > > > > > >
> > > > > > > I understand the motivation, but don't have good feelings about the
> > > > > > > implementation. Like the xattr escaping this should also have the
> > > > > > > property that when fed to an old kernel version, it shouldn't
> > > > > > > interpret this object as a whiteout. Whether it remains hidden like
> > > > > > > the escaped xattrs or if it shows up as something else is
> > > > > > > uninteresting.
> > > > > > >
> > > > > > > It could just be a zero sized regular file with "overlay.whiteout".
> > > > > >
> > > > > > So, I started doing this, where a whiteout is just a regular file with
> > > > > > the xattr set. Initially I thought I only needed to check the xattr
> > > > > > during lookup and convert the inode mode from S_IFREG to S_IFCHR.
> > > > > > However, I also need to hook up readdir and convert DT_REG to DT_CHR,
> > > > > > otherwise readdir will report the wrong d_type. To make it worse,
> > > > > > overlayfs itself looks for DT_CHR to handle whiteouts when listing
> > > > > > files, so nesting is not working without that.
> > > > > >
> > > > > > The only way I see to implement that conversion is to call getxattr()
> > > > > > on every DT_REG file during readdir(), and while a single getxattr()
> > > > > > on lookup is fine, I don't think that is.
> > > > > >
> > > > > > Any other ideas?
> > > > >
> > > > > Not messing with d_type seems a good idea. How about a random
> > > > > unreserved chardev?
> > > >
> > > > Only the whiteout one (0,0) can be created by non-root users.
> > >
> > > I was thinking of (ab)using DT_SOCK or DT_FIFO, but turns out you
> > > can't store xattrs on such files.
> >
> > FWIW, there is also DT_WHT that was defined and never used.
> > But that is just an anecdote.
> >
> > Regarding the issue of avoiding getxattr for every dirent.
> > Note that in readdir, dirent that goes through ovl_cache_update_ino()
> > calls lookup()/stat() on the overlay itself, so as long as ovl_lookup()
> > will treat overlay.whiteout file as a whiteout, the code
> >     /* Mark a stale entry */
> >     p->is_whiteout = true;
> > will kick in and do the right thing for readdir wrt cleaning up
> > lower entries covered with whiteouts, regardless of DT_CHR.
>
> We don't want to treat this file as a whiteout though. We want it to
> be exposed as a regular file that looks like a whiteout marker file
> (i.e. char dev 0,0). Or am I missing something?
>

Not sure if you really need to emulate chardev(0,0) at all.
Suppose that you just define a new way to express a whiteout - an empty
regular file with xattr overlay.whiteout.

Now you could use either chardev(0,0) or overlay.whiteout to compose
overlayfs layers, although internally, the ovl driver only creates
chardev(0,0) to cover lower dentries. I think that is what Miklos meant?

Now you don't need to implement mknod(c,0,0) in overlayfs.
You need to teach ovl_lookup() about the new whiteout format
(which I think you already did), and the problem you mentioned
w.r.t. readdir and DT_CHR is moot as long as the composefs overlayfs,
whose lower layer is the ovl containing overlay.whiteout files,
is mounted with the default xino enabled.

Did I miss anything?

Thanks,
Amir.
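
To make the two whiteout encodings discussed above concrete, here is a rough user-space sketch in C. It is not overlayfs kernel code, and it only illustrates the proposal, not a merged implementation. The names classify_whiteout() and enum wh_kind are hypothetical. The xattr lookup is left to the caller and passed in as a boolean, because the thread does not say which xattr namespace prefix (e.g. trusted. or user.) the overlay.whiteout attribute would use.

```c
#define _DEFAULT_SOURCE          /* expose S_IFCHR/S_IFREG under strict -std modes */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>       /* major(), minor(), makedev() on Linux */
#include <sys/types.h>

/* Hypothetical classification of a lower-layer entry. */
enum wh_kind {
    WH_NONE,     /* not a whiteout */
    WH_CHARDEV,  /* classic whiteout: character device 0,0 */
    WH_XATTR     /* proposed format: empty regular file + overlay.whiteout */
};

/*
 * Classify an entry from its stat data. has_whiteout_xattr is the caller's
 * answer to "does this file carry the overlay.whiteout xattr?" (assumption:
 * the caller knows the right xattr prefix and has already queried it).
 */
enum wh_kind classify_whiteout(const struct stat *st, bool has_whiteout_xattr)
{
    assert(st != NULL);

    /* Existing format: chardev(0,0), no xattr needed. */
    if (S_ISCHR(st->st_mode) &&
        major(st->st_rdev) == 0 && minor(st->st_rdev) == 0)
        return WH_CHARDEV;

    /* Proposed format: zero-sized regular file marked by the xattr. */
    if (S_ISREG(st->st_mode) && st->st_size == 0 && has_whiteout_xattr)
        return WH_XATTR;

    return WH_NONE;
}
```

In this sketch the xattr only matters for zero-sized regular files, so a lookup path would need at most one getxattr() per candidate file and none for the classic chardev(0,0) case. An old kernel that does not know the xattr would see the second kind as an ordinary empty file, which is the property Miklos asked for.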