Re: [PATCH v3 0/6] Composefs: an opportunistically sharing verified image filesystem

Amir Goldstein <amir73il@xxxxxxxxx> · Wed, 25 Jan 2023 14:46:59 +0200

> >
> > Based on Alexander's explanation about the differences between overlayfs
> > lookup vs. composefs lookup of a regular "metacopy" file, I just need to
> > point out that the same optimization (lazy lookup of the lower data
> > file on open) can be done in overlayfs as well.
> > (*) currently, overlayfs needs to look up the lower file also for st_blocks.
> >
> > I am not saying that it should be done or that Miklos will agree to make
> > this change in overlayfs, but that seems to be the major difference.
> > getxattr may have some extra cost depending on in-inode xattr format
> > of erofs, but specifically, the metacopy getxattr can be avoided if this
> > is a special overlayfs RO mount that is marked as EVERYTHING IS
> > METACOPY.
> >
> > I don't expect you guys to now try to hack overlayfs and explore
> > this path to completion.
> > My expectation is that this information will be clearly visible to anyone
> > reviewing future submission, e.g.:
> >
> > - This is the comparison we ran...
> > - This is the reason that composefs gives better results...
> > - It MAY be possible to optimize erofs/overlayfs to get to similar results,
> >   but we did not try to do that
> >
> > It is especially important IMO to get the ACK of both Gao and Miklos
> > on your analysis, because remember that when this thread started,
> > you did not know about the metacopy option and your main argument
> > was saving the time it takes to create the overlayfs layer files in the
> > filesystem, because you were missing some technical background on overlayfs.
>
> We knew about metacopy, which we already use in our tools to create
> mapped image copies when idmapped mounts are not available, and we also
> knew about the other new features in overlayfs.  For example, the
> "volatile" feature, which was mentioned in your
> Overlayfs-containers-lpc-2020 talk, was only submitted upstream after
> begging Miklos and Vivek for months.  I had a PoC that I used and tested
> locally and asked for their help to get it integrated at the file
> system layer.  Using seccomp for the same purpose would have been more
> complex and prone to errors when dealing with external bind mounts
> containing persistent data.
>
> The only missing bit, at least from my side, was to consider an image
> that contains only overlay metadata as something we could distribute.
>

I'm glad that I was able to point this out to you, because now the comparison
between the overlayfs and composefs options is more fair.

> I previously mentioned my wish to use it from a user namespace.  That
> goal seems more challenging with EROFS or any other block device.  I
> don't know about the difficulty of getting overlay metacopy working in a
> user namespace, even though it would be helpful for other use cases as
> well.
>

There is no restriction on metacopy in a user namespace.
Overlayfs needs to be mounted with -o userxattr, and the overlay
xattrs need to use the user.overlay. prefix.

W.r.t. the implied claim that the composefs on-disk format is simple
enough that it could be made robust against exploits, I will remain
silent and let others speak up, but I advise you to take cover,
because this is an explosive topic ;)

Thanks,
Amir.