On Tue, Aug 15, 2023 at 5:09 PM Alexander Larsson <alexl@xxxxxxxxxx> wrote: > > Bringing this design discussion to the list. > > On Tue, Aug 15, 2023 at 3:53 PM Amir Goldstein <amir73il@xxxxxxxxx> wrote: > > > > On Tue, Aug 15, 2023 at 3:33 PM Alexander Larsson <alexl@xxxxxxxxxx> wrote: > > > > > > Hi, as discussed in https://github.com/containers/composefs/issues/172 > > > we have some issues with a composefs rootfs that contains lower dirs. > > > The "inner" overlayfs will consume the overlay xattrs and the > > > whiteouts that were targeting the outer instance. > > > > > > I came up with this patchset to fix this: > > > https://github.com/alexlarsson/linux/tree/ovl-nesting > > > > > > There are two parts, first of all, any xattrs called > > > overlay.escape.XYZ in the lower/upper will be not be treated specially > > > by overlayfs itself, but exposed as overlay.XYZ in the overlayfs > > > mount. This can be nested by using overlay.escape.escape.XYZ, although > > > the stacking limit makes this stop at two levels. > > > > This part looks fine to me. > > I'd consider overlay.overlay.XYZ for the nested overlay xattr prefix, > > but that is just my own taste of playful aesthetics ;-) > > Heh. > > > > Secondly, and whiteouts in lower/upper that have a overlay.nowhiteout > > > xattrs will not be treated as a whiteout by the overlayfs mount. > > > However, since this xattr is stripped as part of the overlayfs mount, > > > the outer instance will apply it. This can be combined with escaping > > > to create a whiteout in a stacked overlayfs instance. > > > > > > > It looks sane. > > > > I am a bit concerned about introducing extra overhead for the > > common case, but maybe whiteouts are rare enough or the overhead > > low enough that it shouldn't matter? > > Yeah, I was somewhat worried about this too, and this is the reason > the code is structured so that it only ever reads this new xattr for > whiteout files. The hope is indeed that they are uncommon enough in > practice to not make this a large issue. > > One thing I noticed is that whiteouts are used in a bunch of places > for the handling of index files. I'm not completely up to speed on > these, but it looks like they are always in the work dir, am I right? Technically, they are in the "indexdir" (i.e. work/index not work/work), but when ofs->indexdir exists ofs->workdir == ofs->indexdir. The whiteouts in indexdir indicates that open_by_handle_at() should return ESTALE if a whiteout index is found (by handle). > So, we could potentially avoid the check for a nowhiteout xattr for > these for a slight performance improvement. Correct? Correct. > > > > Does this make sense to you? > > > > Yes, but it needs documentation. > > I'll add some. > > > > Do we need options for enabling these features? > > > > Don't think so. > > I see no risk of regressions (i.e. that the escaped xattrs currently > > exist in the wild). > > > > It is easy for userspace to test if kernel supports xattr escaping > > even if there are no escaped xattrs, getxattr(trusted.overlay.escaped.blah) > > would return either ENOTSUPP or ENODATA (if escaping is supported). > > Yeah, that makes sense. > Thanks, Amir.