Re: fsnotify path hooks

Amir Goldstein <amir73il@xxxxxxxxx> · Wed, 31 Mar 2021 23:59:27 +0300

On Wed, Mar 31, 2021 at 5:06 PM Amir Goldstein <amir73il@xxxxxxxxx> wrote:
>
> > > As long as "exp_export: export of idmapped mounts not yet supported.\n"
> > > I don't think it matters much.
> > > It feels like adding idmapped mounts to nfsd is on your roadmap.
> > > When you get to that we can discuss adding fsnotify path hooks to nfsd
> > > if Jan agrees to the fsnotify path hooks concept.
> >
> > I was looking at the patch and thinking about it for a few days already. I
> > think that generating fsnotify event later (higher up the stack where we
> > have mount information) is fine and a neat idea. I just dislike the hackery
> > with dentry flags.
>
> Me as well. I used this hack for fast POC.
>
> If we stick with the dual hooks approach, we will have to either pass a new
> argument to vfs helpers or use another trick:
>
> Convert all the many call sites that were converted by Christian to:
>    vfs_XXX(&init_user_ns, ...
> because they do not have mount context, to:
>    vfs_XXX(NULL, ...
>
> Inside the vfs helpers, use init_user_ns when mnt_userns is NULL,
> but pass the original mnt_userns argument to fsnotify_ns_XXX hooks.
> A non-NULL mnt_userns arg means "path_notify" context.
> I have already POC code for passing mnt_userns to fsnotify hooks [1].
>
> I did not check if this assumption always works, but there seems to
> be a large overlap between idmapped aware callers and use cases
> that will require sending events to a mount mark.
>

The above "trick" is pretty silly as I believe Christian intends
to fix all those call sites that pass init_user_ns.

> > Also I'm somewhat uneasy that it is random (from
> > userspace POV) when path event is generated and when not (at least that's
> > my impression from the patch - maybe I'm wrong). How difficult would it be
> > to get rid of it? I mean what if we just moved say fsnotify_create() call
> > wholly up the stack? It would mean more explicit calls to fsnotify_create()
> > from filesystems - as far as I'm looking nfsd, overlayfs, cachefiles,
> > ecryptfs. But that would seem to be manageable.  Also, to maintain sanity,
>
> 1. I don't think we can do that for all the fsnotify_create() hooks, such as
>     debugfs for example
> 2. It is useless to pass the mount from overlayfs to fsnotify, it's a private
>     mount that users cannot set a mark on anyway and Christian has
>     promised to propose the same change for cachefiles and ecryptfs,
>     so I think it's not worth the churn in those call sites
> 3. I am uneasy with removing the fsnotify hooks from vfs helpers and
>     trusting that new callers of vfs_create() will remember to add the high
>     level hooks, so I prefer the existing behavior remains for such callers
>

So I read your proposal the wrong way.
You meant move fsnotify_create() up *without* passing mount context
from overlayfs and friends.

So yeah, I do think it is manageable. I think the best solution would be
something along the lines of wrappers like the following:

static inline int vfs_mkdir(...)
{
        int error = __vfs_mkdir_nonotify(...);
        if (!error)
                fsnotify_mkdir(dir, dentry);
        return error;
}

And then the few call sites that call the fsnotify_path_ hooks
(i.e. in syscalls and perhaps later in nfsd) will call the
__vfs_xxx_nonotify() variant.
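
To make the split concrete, here is a minimal userspace sketch of the
idea, not kernel code. The struct definitions, the event counters,
syscall_mkdir() and the signature of fsnotify_path_mkdir() are all
hypothetical stand-ins invented for illustration, and it is only an
assumption that a path hook would also deliver the inode-level event.
The double-underscore prefix is dropped because such names are reserved
in userspace C.

```c
#include <errno.h>

/* Hypothetical stand-ins for kernel objects; not the real definitions. */
struct inode     { int inode_events; };          /* events seen by inode marks */
struct dentry    { const char *name; int exists; };
struct mount_ctx { int mount_events; };          /* events seen by mount marks */

/* Does the work only; never generates an event. */
static int vfs_mkdir_nonotify(struct inode *dir, struct dentry *dentry)
{
        (void)dir;
        if (dentry->exists)
                return -EEXIST;
        dentry->exists = 1;
        return 0;
}

/* Hook without mount context: only inode-level listeners are told. */
static void fsnotify_mkdir(struct inode *dir, struct dentry *dentry)
{
        (void)dentry;
        dir->inode_events++;
}

/* Hypothetical path hook: the caller knows the mount, so mount marks
 * can be told as well (assumed to imply the inode-level event too). */
static void fsnotify_path_mkdir(struct mount_ctx *mnt, struct inode *dir,
                                struct dentry *dentry)
{
        mnt->mount_events++;
        fsnotify_mkdir(dir, dentry);
}

/* Default helper: callers without mount context keep today's behavior. */
static int vfs_mkdir(struct inode *dir, struct dentry *dentry)
{
        int error = vfs_mkdir_nonotify(dir, dentry);

        if (!error)
                fsnotify_mkdir(dir, dentry);
        return error;
}

/* A syscall-like caller with mount context uses the nonotify variant
 * and raises the path event itself, so the event is not duplicated. */
static int syscall_mkdir(struct mount_ctx *mnt, struct inode *dir,
                         struct dentry *dentry)
{
        int error = vfs_mkdir_nonotify(dir, dentry);

        if (!error)
                fsnotify_path_mkdir(mnt, dir, dentry);
        return error;
}
```

In this sketch a new caller of vfs_mkdir() gets the notification
without having to remember anything, and only the few callers that
have mount context opt in to the nonotify variant.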

I suppose that with this approach I could make all the relevant events
available for mount mark with relatively little churn.
I will try it out.

Thanks,
Amir.