[cc the correct containers list]

On Wed, Oct 16, 2024 at 2:45 PM Amir Goldstein <amir73il@xxxxxxxxx> wrote:
>
> On Tue, Oct 15, 2024 at 4:01 PM Christian Brauner <brauner@xxxxxxxxxx> wrote:
> >
> > On Sun, Oct 13, 2024 at 06:34:18PM +0200, Amir Goldstein wrote:
> > > On Fri, May 24, 2024 at 2:35 PM Amir Goldstein <amir73il@xxxxxxxxx> wrote:
> > > >
> > > > On Fri, May 24, 2024 at 1:19 PM Christian Brauner <brauner@xxxxxxxxxx> wrote:
> > > > >
> > > > > A limitation of open_by_handle_at() is that it's currently not possible
> > > > > to use it from within containers at all because we require CAP_DAC_READ_SEARCH
> > > > > in the initial namespace. That's unfortunate because there are scenarios where
> > > > > using open_by_handle_at() from within containers would be useful.
> > > > >
> > > > > Two examples:
> > > > >
> > > > > (1) cgroupfs allows encoding cgroups as file handles and reopening them with
> > > > >     open_by_handle_at().
> > > > > (2) Fanotify allows placing filesystem watches, but they currently aren't
> > > > >     usable in containers because the returned file handles cannot be used.
> > > > >
> > >
> > > Christian,
> > >
> > > Follow up question:
> > > Now that open_by_handle_at(2) is supported from non-root userns,
> > > what about this old patch to allow sb/mount watches from non-root userns?
> > > https://lore.kernel.org/linux-fsdevel/20230416060722.1912831-1-amir73il@xxxxxxxxx/
> > >
> > > Is it useful for any of your use cases?
> > > Should I push it forward?
> >
> > Dammit, I answered that message already yesterday but somehow it didn't
> > get sent or got lost in some other way.
> >
> > I personally don't have a use-case for it but the systemd folks might
> > and it would be best to just rope them in.
>
> Lennart,
>
> I must have asked this question before, but enough time has passed that
> I am going to ask it again.
>
> Now that Christian has added support for open_by_handle_at(2) by non-root
> userns admin, it is very low-hanging fruit to support fanotify sb/mount
> watches inside userns with this simple patch [1], which was last posted
> in 2023.
>
> My question is whether this is useful, because there are still a few
> limitations.
> I will start with what is possible with this patch:
> 1. Watch an entire tmpfs filesystem that was mounted inside userns
> 2. Watch an entire overlayfs filesystem that was mounted [*] inside userns
> 3. Watch an entire mount [**] of any [***] filesystem that was
>     idmapped mounted into userns
>
> Now the fine print:
> [*] Overlayfs sb/mount CAN be watched, but decoding the file handle in
>     events to a path only works if overlayfs is mounted with mount option
>     nfs_export=on, which conflicts with mount option metacopy=on, which
>     is often used in containers (e.g. podman)
> [**] Watching a mount is only possible with the legacy set of fanotify
>     events (i.e. open,close,access,modify), so this is less useful for
>     directory tree change tracking
> [***] Watching an idmapped mount has the same limitations as watching
>     an sb/mount in the root userns, namely, the filesystem needs to have
>     a non-zero fsid (so not FUSE) and the filesystem needs to have a
>     uniform fsid (so not a btrfs subvolume), although with some effort,
>     I could make watching an idmapped mount of a btrfs subvol work.
>
> The lack of support for watching btrfs subvolumes and overlayfs with
> metacopy=on reduces the attractiveness for containers, but perhaps there
> are still use cases where watching an idmapped mount or a userns-private
> tmpfs is useful?
>
> To try out this patch inside your favorite container/userns, you can build
> fsnotifywait with a patch to support watching inside userns [2].
> It's actually only the one-line O_DIRECTORY patch that is needed for the
> basic tmpfs userns mount case.
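
For anyone who wants a quick check before building the patched
fsnotifywait [2], the sketch below tries to place a filesystem (sb) watch
on a path and reports which errno, if any, comes back. It is only a rough
probe under stated assumptions: the constants are copied by hand from
<linux/fanotify.h>, the helper name probe_sb_watch() is made up for this
example, and Python/ctypes is used purely for brevity. It does not read
events or decode file handles. Without the patch I would expect the mark
call to fail for a userns admin (typically with EPERM), but I have not
verified that on every kernel version.

```python
import ctypes
import errno
import os
import sys

# Assumption: values copied by hand from <linux/fanotify.h> and <fcntl.h>.
FAN_CLASS_NOTIF = 0x00000000
FAN_REPORT_FID = 0x00000200
FAN_MARK_ADD = 0x00000001
FAN_MARK_FILESYSTEM = 0x00000100
FAN_CREATE = 0x00000100
FAN_DELETE = 0x00000200
FAN_ONDIR = 0x40000000
AT_FDCWD = -100

# Use the libc wrappers for fanotify_init(2) and fanotify_mark(2).
libc = ctypes.CDLL(None, use_errno=True)
libc.fanotify_init.argtypes = [ctypes.c_uint, ctypes.c_uint]
libc.fanotify_init.restype = ctypes.c_int
libc.fanotify_mark.argtypes = [ctypes.c_int, ctypes.c_uint, ctypes.c_uint64,
                               ctypes.c_int, ctypes.c_char_p]
libc.fanotify_mark.restype = ctypes.c_int


def _errno_name():
    """Name of the errno left by the last failed libc call."""
    return errno.errorcode.get(ctypes.get_errno(), "UNKNOWN")


def probe_sb_watch(path):
    """Hypothetical helper: try to add an sb watch covering `path`.

    Returns None if the watch was added, otherwise the errno name
    (e.g. "EPERM") of the first call that failed.
    """
    # FID reporting mode: events carry file handles instead of open fds.
    fd = libc.fanotify_init(FAN_CLASS_NOTIF | FAN_REPORT_FID, os.O_RDONLY)
    if fd < 0:
        return _errno_name()
    try:
        mask = FAN_CREATE | FAN_DELETE | FAN_ONDIR
        ret = libc.fanotify_mark(fd, FAN_MARK_ADD | FAN_MARK_FILESYSTEM,
                                 mask, AT_FDCWD, os.fsencode(path))
        return None if ret == 0 else _errno_name()
    finally:
        os.close(fd)


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    print(target, "->", probe_sb_watch(target) or "watch added")
```

Run it with the mount point of the userns-private tmpfs (or overlayfs) as
the only argument, once inside the userns and once in the root userns, and
compare the two results.
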
>
> Jan,
>
> If we do not get any buy-in from potential consumers now, do you think that
> we should go through with the patch and advertise the new supported use cases,
> so that users may come later on?
>
> Thanks,
> Amir.
>
> [1] https://github.com/amir73il/linux/commits/fanotify_userns/
> [2] https://github.com/amir73il/inotify-tools/commits/fanotify_userns/