Re: [PATCH v2 4/5] ovl: Support creation of whiteout files on overlayfs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Aug 22, 2023 at 5:31 PM Amir Goldstein <amir73il@xxxxxxxxx> wrote:
>
> On Tue, Aug 22, 2023 at 5:36 PM Alexander Larsson <alexl@xxxxxxxxxx> wrote:
> >
> > On Tue, Aug 22, 2023 at 4:25 PM Alexander Larsson <alexl@xxxxxxxxxx> wrote:
> > >
> > > On Tue, Aug 22, 2023 at 3:56 PM Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
> > > >
> > > > On Tue, 22 Aug 2023 at 15:22, Alexander Larsson <alexl@xxxxxxxxxx> wrote:
> > > > >
> > > > > On Mon, Aug 21, 2023 at 1:00 PM Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
> > > > > >
> > > > > > On Thu, 17 Aug 2023 at 13:05, Alexander Larsson <alexl@xxxxxxxxxx> wrote:
> > > > > > >
> > > > > > > This is needed to properly stack overlay filesystems, I.E, being able
> > > > > > > to create a whiteout file on an overlay mount and then use that as
> > > > > > > part of the lowerdir in another overlay mount.
> > > > > > >
> > > > > > > The way this works is that we create a regular whiteout, but set the
> > > > > > > `overlay.nowhiteout` xattr on it. Whenever we check if a file is a
> > > > > > > whiteout we check this xattr and don't treat it as a whiteout if it is
> > > > > > > set. The xattr itself is then stripped and when viewed as part of the
> > > > > > > overlayfs mount it looks like a regular whiteout.
> > > > > > >
> > > > > >
> > > > > > I understand the motivation, but don't have good feelings about the
> > > > > > implementation.  Like the xattr escaping this should also have the
> > > > > > property that when fed to an old kernel version, it shouldn't
> > > > > > interpret this object as a whiteout.  Whether it remains hidden like
> > > > > > the escaped xattrs or if it shows up as something else is
> > > > > > uninteresting.
> > > > > >
> > > > > > It could just be a zero sized regular file with "overlay.whiteout".
> > > > >
> > > > > So, I started doing this, where a whiteout is just a regular file with
> > > > > the xattr set. Initially I thought I only needed to check the xattr
> > > > > during lookup and convert the inode mode from S_IFREG to S_IFCHR.
> > > > > However, I also need to hook up readdir and convert DT_REG to DT_CHR,
> > > > > otherwise readdir will report the wrong d_type. To make it worse,
> > > > > overlayfs itself looks for DT_CHR to handle whiteouts when listing
> > > > > files, so nesting is not working without that.
> > > > >
> > > > > The only way I see to implement that conversion is to call getxattr()
> > > > > on every DT_REG file during readdir(), and while a single getxattr()
> > > > > on lookup is fine, I don't think that is.
> > > > >
> > > > > Any other ideas?
> > > >
> > > > Not messing with d_type seems a good idea.   How about a random
> > > > unreserved chardev?
> > >
> > > Only the whiteout one (0,0) can be created by non-root users.
> >
> > I was thinking of (ab)using DT_SOCK or DT_FIFO, but turns out you
> > can't store xattrs on such files.
>
> FWIW, there is also DT_WHT that was defined and never used.
> But that is just an anecdote.
>
> Regarding the issue of avoiding getxattr for every dirent.
> Note that in readdir, dirent that goes through ovl_cache_update_ino()
> calls lookup()/stat() on the overlay itself, so as long as ovl_lookup()
> will treat overlay.whiteout file as a whiteout, the code
>                  /* Mark a stale entry */
>                 p->is_whiteout = true;
> will kick in and do the right thing for readdir wrt cleaning up
> lower entries covered with whiteouts, regardless of DT_CHR.
>
> Now all that is left is to make sure that ovl_cache_update_ino()
> is called if there are possibly overlay.whiteout files.
>
> For the case of nested ovl, upper and lower fs cannot be the same,
> so ovl_same_fs() is false.
> Therefore, as long as xino is enabled (this is the default),
> ovl_same_dev() is true => ovl_xino_bits() > 0 =>
> ovl_calc_d_ino() is true and ovl_cache_update_ino() will be
> called for all merged dirents.
>
> IOW, unless I am missing something, if you implement overlay.whiteout
> logic in ovl_lookup() correctly, readdir should "just work" as long as
> the mounter does not explicitly use -o xino=off.

Things are not that rosy for this.

First of all, the default value for OVERLAY_FS_XINO_AUTO is no, so by
default xino is not enabled. This means that overlay.whiteout only
works if you enable xino=on/auto in the mount.

Secondly, in the case where all upper and lower are on the same fs,
even if xino is enabled it is ignored. This is not the  case where the
lower is on a nested  overlayfs as you say, but maybe we want to
create a lower dir that works on both when stored on a overlayfs and
elsewere. Relying on xino which is essentially unrelated to whiteouts
to get enabled seems quite fragile.

I'm gonna instead try to make the alternative whiteout be a fifo with
an xattr. Then we can add DT_FIFO to the maybe_whiteout case during
readdir() which will then get picked up as a whiteout without relying
on xino.

I still have an issue with this approach though. Suppose we have a
rootfs that we want to make available as an overlayfs lower (using
trusted.overlay.*). We can escape all the xattrs in the rootfs with
trusted.overlay.overlay to make them show up correctly. However,
suppose the rootfs contains a char(0.0) whiteout. how do we encode
this? We can use a file with trusted.overlay.overlay.whiteout, which
then will be visible as trusted.overlay.whiteout in the overlayfs
mount. This is normally functionally equivalent to the char(0,0)
whiteout, but not if the user of the rootfs mounts the second
overlayfs using userxattr.

Maybe it could convert a char(0,0) to a file with both
trusted.overlay.overlay.whiteout and user.overlay.whiteout? The the
resulting overlayfs file could be used as a whiteout for both types,
just like the char node...

-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
 Alexander Larsson                                Red Hat, Inc
       alexl@xxxxxxxxxx         alexander.larsson@xxxxxxxxx





[Index of Archives]     [Linux Filesystems Devel]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux