Re: fsnotify path hooks

Jan Kara <jack@xxxxxxx> · Thu, 1 Apr 2021 12:29:47 +0200

On Wed 31-03-21 23:59:27, Amir Goldstein wrote:
> On Wed, Mar 31, 2021 at 5:06 PM Amir Goldstein <amir73il@xxxxxxxxx> wrote:
> >
> > > > As long as "exp_export: export of idmapped mounts not yet supported.\n"
> > > > I don't think it matters much.
> > > > It feels like adding idmapped mounts to nfsd is on your roadmap.
> > > > When you get to that we can discuss adding fsnotify path hooks to nfsd
> > > > if Jan agrees to the fsnotify path hooks concept.
> > >
> > > I was looking at the patch and thinking about it for a few days already. I
> > > think that generating fsnotify event later (higher up the stack where we
> > > have mount information) is fine and a neat idea. I just dislike the hackery
> > > with dentry flags.
> >
> > Me as well. I used this hack for fast POC.
> >
> > If we stick with the dual hooks approach, we will have to either pass a new
> > argument to vfs helpers or use another trick:
> >
> > Convert all the many call sites that were converted by Christian to:
> >    vfs_XXX(&init_user_ns, ...
> > because they do not have mount context, to:
> >    vfs_XXX(NULL, ...
> >
> > Inside the vfs helpers, use init_user_ns when mnt_userns is NULL,
> > but pass the original mnt_userns argument to fsnotify_ns_XXX hooks.
> > A non-NULL mnt_userns arg means "path_notify" context.
> > I have already POC code for passing mnt_userns to fsnotify hooks [1].
> >
> > I did not check if this assumption always works, but there seems to
> > be a large overlap between idmapped aware callers and use cases
> > that will require sending events to a mount mark.
> >
> 
> The above "trick" is pretty silly as I believe Christian intends
> to fix all those call sites that pass init_user_ns.

If he does that, we should also have the mountpoint there to use for
fsnotify, shouldn't we? :)

> > > Also I'm somewhat uneasy that it is random (from
> > > userspace POV) when path event is generated and when not (at least that's
> > > my impression from the patch - maybe I'm wrong). How difficult would it be
> > > to get rid of it? I mean what if we just moved say fsnotify_create() call
> > > wholly up the stack? It would mean more explicit calls to fsnotify_create()
> > > from filesystems - as far as I'm looking nfsd, overlayfs, cachefiles,
> > > ecryptfs. But that would seem to be manageable.  Also, to maintain sanity,
> >
> > 1. I don't think we can do that for all the fsnotify_create() hooks, such as
> >     debugfs for example
> > 2. It is useless to pass the mount from overlayfs to fsnotify, it's a private
> >     mount that users cannot set a mark on anyway and Christian has
> >     promised to propose the same change for cachefiles and ecryptfs,
> >     so I think it's not worth the churn in those call sites
> > 3. I am uneasy with removing the fsnotify hooks from vfs helpers and
> >     trusting that new callers of vfs_create() will remember to add the high
> >     level hooks, so I prefer the existing behavior remains for such callers
> >
> 
> So I read your proposal the wrong way.
> You meant move fsnotify_create() up *without* passing mount context
> from overlayfs and friends.

Well, I was thinking that we could find appropriate mount context for
overlayfs or ecryptfs (which just shows how little I know about these
filesystems ;) I didn't think of e.g. debugfs. Anyway, if we can make
mountpoint marks work for directory events at least for most filesystems, I
think that is OK as well. However, we would then need to detect whether
a given filesystem actually supports mount marks for dir events and, if not,
report an error from fanotify_mark() instead of silently not generating
events.
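
A rough userspace model of that check, purely as a sketch: every name here
(FS_CAP_MNT_DIRENT, struct fs_model, mark_add_check(), the EV_* bits) is
made up for illustration, and the choice of -EOPNOTSUPP is an assumption as
well. Nothing like this exists in the tree; it only shows the shape of
"refuse the mark up front rather than silently drop events".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical event bits, loosely modelled on FS_CREATE and friends. */
#define EV_CREATE	0x1u
#define EV_DELETE	0x2u
#define EV_MODIFY	0x4u
#define EV_DIRENT	(EV_CREATE | EV_DELETE)

/* Hypothetical capability: fs reports dir events with mount context. */
#define FS_CAP_MNT_DIRENT	0x1u

enum mark_type { MARK_INODE, MARK_MOUNT };

/* Minimal stand-in for a filesystem; not a kernel structure. */
struct fs_model {
	const char *name;
	unsigned int caps;
};

/*
 * Return 0 if the requested mark can be honoured, -EOPNOTSUPP if it is a
 * mount mark asking for dir events on a filesystem that cannot deliver
 * them with mount context.
 */
static int mark_add_check(const struct fs_model *fs, enum mark_type type,
			  unsigned int mask)
{
	assert(fs != NULL);

	if (type != MARK_MOUNT)
		return 0;	/* inode marks keep working as before */
	if (!(mask & EV_DIRENT))
		return 0;	/* no dir events requested */
	if (fs->caps & FS_CAP_MNT_DIRENT)
		return 0;	/* fs passes mount context to the hooks */
	return -EOPNOTSUPP;
}
```

The point of the design is that userspace learns at mark time that the
combination is unsupported, instead of waiting for events that never come.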

> So yeh, I do think it is manageable. I think the best solution would be
> something along the lines of wrappers like the following:
> 
> static inline int vfs_mkdir(...)
> {
>         int error = __vfs_mkdir_nonotify(...);
>         if (!error)
>                 fsnotify_mkdir(dir, dentry);
>         return error;
> }
> 
> And then the few call sites that call the fsnotify_path_ hooks
> (i.e. in syscalls and perhaps later in nfsd) will call the
> __vfs_xxx_nonotify() variant.

Yes, that is OK with me. Or we could have something like:

static inline void fsnotify_dirent(struct vfsmount *mnt, struct inode *dir,
				   struct dentry *dentry, __u32 mask)
{
	if (!mnt) {
		fsnotify(mask, d_inode(dentry), FSNOTIFY_EVENT_INODE, dir,
			 &dentry->d_name, NULL, 0);
	} else {
		struct path path = {
			.mnt = mnt,
			.dentry = d_find_any_alias(dir)
		};
		fsnotify(mask, d_inode(dentry), FSNOTIFY_EVENT_PATH, &path,
			 &dentry->d_name, NULL, 0);
		/* d_find_any_alias() took a reference on the alias; drop it */
		dput(path.dentry);
	}
}

static inline void fsnotify_mkdir(struct vfsmount *mnt, struct inode *inode,
				  struct dentry *dentry)
{
	audit_inode_child(inode, dentry, AUDIT_TYPE_CHILD_CREATE);

	fsnotify_dirent(mnt, inode, dentry, FS_CREATE | FS_ISDIR);
}

static inline int vfs_mkdir(mnt, ...)
{
	int error = __vfs_mkdir_nonotify(...);
	if (!error)
		fsnotify_mkdir(mnt, dir, dentry);
	return error;
}

And pass mnt to vfs_mkdir() for filesystems where we have it...
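
For anyone who wants to poke at the control flow outside the kernel, here is
a small userspace model of the same pattern. It is only a sketch: the
structs are empty stand-ins for the kernel types, the model_* helpers and
last_event are hypothetical, and "fail on an empty name" is an arbitrary
stand-in for a real mkdir error.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for kernel types; only what the sketch needs. */
struct vfsmount { int id; };
struct dentry { const char *name; };

enum ev_kind { EV_NONE, EV_INODE, EV_PATH };

/* Records the last event so a caller can inspect what was reported. */
static enum ev_kind last_event = EV_NONE;

/* Models fsnotify_mkdir(): path event if a mount is known, else inode event. */
static void model_fsnotify_mkdir(const struct vfsmount *mnt,
				 const struct dentry *dentry)
{
	assert(dentry != NULL);
	last_event = mnt ? EV_PATH : EV_INODE;
}

/* Models __vfs_mkdir_nonotify(): fails on an empty name, reports nothing. */
static int model_mkdir_nonotify(const struct dentry *dentry)
{
	return (dentry->name && dentry->name[0]) ? 0 : -1;
}

/* Models vfs_mkdir(mnt, ...): notify only on success, return the error. */
static int model_vfs_mkdir(const struct vfsmount *mnt,
			   const struct dentry *dentry)
{
	int error = model_mkdir_nonotify(dentry);

	if (!error)
		model_fsnotify_mkdir(mnt, dentry);
	return error;
}
```

Callers without mount context pass NULL and get the inode-style event;
callers that have a mount pass it and get the path-style event. A failed
operation reports nothing in either case.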

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR