Hi Gao,

CC vfs

On Fri, Aug 30, 2024 at 5:29 AM Gao Xiang <hsiangkao@xxxxxxxxxxxxxxxxx> wrote:
> It actually has been around for years: For containers and other sandbox
> use cases, there will be thousands (and even more) of authenticated
> (sub)images running on the same host, unlike OS images.
>
> Of course, all scenarios can use the same EROFS on-disk format, but
> bdev-backed mounts just work well for OS images since golden data is
> dumped into real block devices.  However, it's somewhat hard for
> container runtimes to manage and isolate so many unnecessary virtual
> block devices safely and efficiently [1]: they just look like a burden
> to orchestrators and file-backed mounts are preferred indeed.  There
> were already enough attempts such as Incremental FS, the original
> ComposeFS and PuzzleFS acting in the same way for immutable fses.  As
> for current EROFS users, ComposeFS, containerd and Android APEXs will
> be directly benefited from it.
>
> On the other hand, previous experimental feature "erofs over fscache"
> was once also intended to provide a similar solution (inspired by
> Incremental FS discussion [2]), but the following facts show file-backed
> mounts will be a better approach:
>  - Fscache infrastructure has recently been moved into new Netfslib
>    which is an unexpected dependency to EROFS really, although it
>    originally claims "it could be used for caching other things such as
>    ISO9660 filesystems too." [3]
>
>  - It takes an unexpectedly long time to upstream Fscache/Cachefiles
>    enhancements.  For example, the failover feature took more than
>    one year, and the deamonless feature is still far behind now;
>
>  - Ongoing HSM "fanotify pre-content hooks" [4] together with this will
>    perfectly supersede "erofs over fscache" in a simpler way since
>    developers (mainly containerd folks) could leverage their existing
>    caching mechanism entirely in userspace instead of strictly following
>    the predefined in-kernel caching tree hierarchy.
>
> After "fanotify pre-content hooks" lands upstream to provide the same
> functionality, "erofs over fscache" will be removed then (as an EROFS
> internal improvement and EROFS will not have to bother with on-demand
> fetching and/or caching improvements anymore.)
>
> [1] https://github.com/containers/storage/pull/2039
> [2] https://lore.kernel.org/r/CAOQ4uxjbVxnubaPjVaGYiSwoGDTdpWbB=w_AeM6YM=zVixsUfQ@xxxxxxxxxxxxxx
> [3] https://docs.kernel.org/filesystems/caching/fscache.html
> [4] https://lore.kernel.org/r/cover.1723670362.git.josef@xxxxxxxxxxxxxx
>
> Closes: https://github.com/containers/composefs/issues/144
> Signed-off-by: Gao Xiang <hsiangkao@xxxxxxxxxxxxxxxxx>

Thanks for your patch, which is now commit fb176750266a3d7f ("erofs:
add file-backed mount support").

> ---
> v2:
>  - should use kill_anon_super();
>  - add O_LARGEFILE to support large files.
>
>  fs/erofs/Kconfig    | 17 ++++++++++
>  fs/erofs/data.c     | 35 ++++++++++++---------
>  fs/erofs/inode.c    |  5 ++-
>  fs/erofs/internal.h | 11 +++++--
>  fs/erofs/super.c    | 76 +++++++++++++++++++++++++++++----------------
>  5 files changed, 100 insertions(+), 44 deletions(-)
>
> diff --git a/fs/erofs/Kconfig b/fs/erofs/Kconfig
> index 7dcdce660cac..1428d0530e1c 100644
> --- a/fs/erofs/Kconfig
> +++ b/fs/erofs/Kconfig
> @@ -74,6 +74,23 @@ config EROFS_FS_SECURITY
>
>  	  If you are not using a security module, say N.
>
> +config EROFS_FS_BACKED_BY_FILE
> +	bool "File-backed EROFS filesystem support"
> +	depends on EROFS_FS
> +	default y

I am a bit reluctant to have this default to y, without an ack from
the VFS maintainers.

> +	help
> +	  This allows EROFS to use filesystem image files directly, without
> +	  the intercession of loopback block devices or likewise. It is
> +	  particularly useful for container images with numerous blobs and
> +	  other sandboxes, where loop devices behave intricately.  It can also
> +	  be used to simplify error-prone lifetime management of unnecessary
> +	  virtual block devices.
> +
> +	  Note that this feature, along with ongoing fanotify pre-content
> +	  hooks, will eventually replace "EROFS over fscache."
> +
> +	  If you don't want to enable this feature, say N.
> +
>  config EROFS_FS_ZIP
>  	bool "EROFS Data Compression Support"
>  	depends on EROFS_FS

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds