> > Based on Alexander's explanation about the differences between overlayfs
> > lookup vs. composefs lookup of a regular "metacopy" file, I just need to
> > point out that the same optimization (lazy lookup of the lower data
> > file on open)
> > can be done in overlayfs as well.
> > (*) currently, overlayfs needs to look up the lower file also for st_blocks.
> >
> > I am not saying that it should be done or that Miklos will agree to make
> > this change in overlayfs, but that seems to be the major difference.
> > getxattr may have some extra cost depending on in-inode xattr format
> > of erofs, but specifically, the metacopy getxattr can be avoided if this
> > is a special overlayfs RO mount that is marked as EVERYTHING IS
> > METACOPY.
> >
> > I don't expect you guys to now try to hack overlayfs and explore
> > this path to completion.
> > My expectation is that this information will be clearly visible to anyone
> > reviewing future submission, e.g.:
> >
> > - This is the comparison we ran...
> > - This is the reason that composefs gives better results...
> > - It MAY be possible to optimize erofs/overlayfs to get to similar results,
> >   but we did not try to do that
> >
> > It is especially important IMO to get the ACK of both Gao and Miklos
> > on your analysis, because remember that when this thread started,
> > you did not know about the metacopy option and your main argument
> > was saving the time it takes to create the overlayfs layer files in the
> > filesystem, because you were missing some technical background on overlayfs.
>
> we knew about metacopy, which we already use in our tools to create
> mapped image copies when idmapped mounts are not available, and also
> knew about the other new features in overlayfs. For example, the
> "volatile" feature which was mentioned in your
> Overlayfs-containers-lpc-2020 talk, was only submitted upstream after
> begging Miklos and Vivek for months.
> I had a PoC that I used and tested
> locally and asked for their help to get it integrated at the file
> system layer, using seccomp for the same purpose would have been more
> complex and prone to errors when dealing with external bind mounts
> containing persistent data.
>
> The only missing bit, at least from my side, was to consider an image
> that contains only overlay metadata as something we could distribute.
>

I'm glad that I was able to point this out to you, because now the
comparison between the overlayfs and composefs options is more fair.

> I previously mentioned my wish of using it from a user namespace, the
> goal seems more challenging with EROFS or any other block devices. I
> don't know about the difficulty of getting overlay metacopy working in a
> user namespace, even though it would be helpful for other use cases as
> well.
>

There is no restriction on metacopy in a user namespace.
overlayfs needs to be mounted with -o userxattr and the overlay
xattrs need to use the user.overlay. prefix.

w.r.t. the implied claim that composefs on-disk format is simple enough
so it could be made robust enough to avoid exploits, I will remain
silent and let others speak up, but I advise you to take cover,
because this is an explosive topic ;)

Thanks,
Amir.
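
P.S. To make the xattr naming above concrete, here is a minimal sketch
of how a tool might pick the attribute name for either namespace. The
helper names overlay_xattr_name() and has_overlay_xattr() are made up
for illustration and are not part of any overlayfs tooling. The sketch
only builds and reads xattr names. It does not mount anything and does
not verify what a given kernel accepts.

```python
import os

TRUSTED_PREFIX = "trusted.overlay."  # default overlayfs xattr namespace
USER_PREFIX = "user.overlay."        # namespace used with -o userxattr


def overlay_xattr_name(name, userxattr=False):
    """Return the full xattr name for an overlay attribute such as 'metacopy'."""
    prefix = USER_PREFIX if userxattr else TRUSTED_PREFIX
    return prefix + name


def has_overlay_xattr(path, name, userxattr=False):
    """Report whether `path` carries the given overlay xattr (hypothetical helper)."""
    try:
        os.getxattr(path, overlay_xattr_name(name, userxattr),
                    follow_symlinks=False)
        return True
    except OSError:
        # Missing attribute, missing file, unsupported filesystem,
        # or insufficient permission to read this namespace.
        return False
```

For example, has_overlay_xattr(path, "metacopy", userxattr=True) would
check a layer file for user.overlay.metacopy, which an unprivileged
process can read, whereas the trusted.* namespace requires
CAP_SYS_ADMIN.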