On Thu, Apr 1, 2021 at 1:29 PM Jan Kara <jack@xxxxxxx> wrote:
>
> On Wed 31-03-21 23:59:27, Amir Goldstein wrote:
> > On Wed, Mar 31, 2021 at 5:06 PM Amir Goldstein <amir73il@xxxxxxxxx> wrote:
> > >
> > > > > As long as "exp_export: export of idmapped mounts not yet supported.\n"
> > > > > I don't think it matters much.
> > > > > It feels like adding idmapped mounts to nfsd is on your roadmap.
> > > > > When you get to that we can discuss adding fsnotify path hooks to nfsd
> > > > > if Jan agrees to the fsnotify path hooks concept.
> > > >
> > > > I was looking at the patch and thinking about it for a few days already. I
> > > > think that generating fsnotify event later (higher up the stack where we
> > > > have mount information) is fine and a neat idea. I just dislike the hackery
> > > > with dentry flags.
> > >
> > > Me as well. I used this hack for fast POC.
> > >
> > > If we stick with the dual hooks approach, we will have to either pass a new
> > > argument to vfs helpers or use another trick:
> > >
> > > Convert all the many calls sites that were converted by Christian to:
> > >    vfs_XXX(&init_user_ns, ...
> > > because they do not have mount context, to:
> > >    vfs_XXX(NULL, ...
> > >
> > > Inside the vfs helpers, use init_user_ns when mnt_userns is NULL,
> > > but pass the original mnt_userns argument to fsnotify_ns_XXX hooks.
> > > A non-NULL mnt_userns arg means "path_notify" context.
> > > I have already POC code for passing mnt_userns to fsnotify hooks [1].
> > >
> > > I did not check if this assumption always works, but there seems to
> > > be a large overlap between idmapped aware callers and use cases
> > > that will require sending events to a mount mark.
> > >
> >
> > The above "trick" is pretty silly as I believe Christian intends
> > to fix all those call sites that pass init_user_ns.
>
> If he does that we also should have the mountpoint there to use for
> fsnotify, shouldn't we? :)
>

Yes, but that's not going to be hard for us anyway.
nfsd has mount context available via fhp for any access and for
overlayfs/ecryptfs we don't want the mount mark event.
I will explain why...

> > > > Also I'm somewhat uneasy that it is random (from
> > > > userspace POV) when path event is generated and when not (at least that's
> > > > my impression from the patch - maybe I'm wrong). How difficult would it be
> > > > to get rid of it? I mean what if we just moved say fsnotify_create() call
> > > > wholly up the stack? It would mean more explicit calls to fsnotify_create()
> > > > from filesystems - as far as I'm looking nfsd, overlayfs, cachefiles,
> > > > ecryptfs. But that would seem to be manageable. Also, to maintain sanity,
> > >
> > > 1. I don't think we can do that for all the fsnotify_create() hooks, such as
> > >     debugfs for example
> > > 2. It is useless to pass the mount from overlayfs to fsnotify, its a private
> > >     mount that users cannot set a mark on anyway and Christian has
> > >     promised to propose the same change for cachefiles and ecryptfs,
> > >     so I think it's not worth the churn in those call sites
> > > 3. I am uneasy with removing the fsnotify hooks from vfs helpers and
> > >     trusting that new callers of vfs_create() will remember to add the high
> > >     level hooks, so I prefer the existing behavior remains for such callers
> > >
> >
> > So I read your proposal the wrong way.
> > You meant move fsnotify_create() up *without* passing mount context
> > from overlayfs and friends.
>
> Well, I was thinking that we could find appropriate mount context for
> overlayfs or ecryptfs (which just shows how little I know about these
> filesystems ;) I didn't think of e.g. debugfs. Anyway, if we can make
> mountpoint marks work for directory events at least for most filesystems, I
> think that is OK as well.
> However it would be then needed to detect whether
> a given filesystem actually supports mount marks for dir events and if not,
> report error from fanotify_mark() instead of silently not generating
> events.
>

It's not about "filesystems that support mount marks".
Mount marks will work perfectly well on overlayfs.

The thing is, if you place a mount mark on the underlying store of
overlayfs (say xfs) and then files are created/deleted by the
overlayfs driver (in xfs), you won't get any events, because
overlayfs uses a private mount clone to perform underlying operations.

So while we CAN get the overlayfs underlying layer mount context,
it is irrelevant because no user can set up a mount mark on that
private mount, so there is no need to bother calling the path hooks.

This is not the case with nfsd IMO.
With nfsd, when "exporting" a path to clients, nfsd is really exporting
a specific mount (and keeping that mount busy too).
It can even export whole mount topologies.

But then again, getting the mount context in every nfsd operation
is easy; there is an export context for client requests and the export
context has the exported path.

Therefore, nfsd is the only user of the vfs helpers that I expect
to call the fsnotify path hooks (other than syscalls).

> > So yeh, I do think it is manageable. I think the best solution would be
> > something along the lines of wrappers like the following:
> >
> > static inline int vfs_mkdir(...)
> > {
> >         int error = __vfs_mkdir_nonotify(...);
> >         if (!error)
> >                 fsnotify_mkdir(dir, dentry);
> >         return error;
> > }
> >
> > And then the few call sites that call the fsnotify_path_ hooks
> > (i.e. in syscalls and perhaps later in nfsd) will call the
> > __vfs_xxx_nonotify() variant.
>
> Yes, that is OK with me.
> Or we could have something like:
>
> static inline void fsnotify_dirent(struct vfsmount *mnt, struct inode *dir,
>                                    struct dentry *dentry, __u32 mask)
> {
>         if (!mnt) {
>                 fsnotify(mask, d_inode(dentry), FSNOTIFY_EVENT_INODE, dir,
>                          &dentry->d_name, NULL, 0);
>         } else {
>                 struct path path = {
>                         .mnt = mnt,
>                         .dentry = d_find_any_alias(dir)
>                 };
>                 fsnotify(mask, d_inode(dentry), FSNOTIFY_EVENT_PATH, &path,
>                          &dentry->d_name, NULL, 0);
>         }
> }
>
> static inline void fsnotify_mkdir(struct vfsmount *mnt, struct inode *inode,
>                                   struct dentry *dentry)
> {
>         audit_inode_child(inode, dentry, AUDIT_TYPE_CHILD_CREATE);
>
>         fsnotify_dirent(mnt, inode, dentry, FS_CREATE | FS_ISDIR);
> }
>
> static inline int vfs_mkdir(mnt, ...)
> {
>         int error = __vfs_mkdir_nonotify(...);
>         if (!error)
>                 fsnotify_mkdir(mnt, dir, dentry);
> }
>

I've done something similar to that. I think it's a bit cleaner,
but we can debate the details later.
Pushed POC to branch fsnotify_path_hooks.

At the moment, create, delete, move and move_self are supported
for syscalls, and helpers are ready for nfsd.

The method I used for the rename hook is a bit different from
the other hooks, because the other hooks are very easy to open code
while rename is complex, so I created a helper for nfsd to call.

Thanks,
Amir.
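
For anyone reading along, the wrapper pattern discussed in this thread
can be modelled in user space roughly as follows. This is only a sketch
under stated assumptions: every mock_* name is hypothetical and stands in
for kernel objects and helpers, and it is not the code on the
fsnotify_path_hooks branch. It shows a "nonotify" operation wrapped by a
helper that reports a path event when the caller has mount context and an
inode event when it does not.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct vfsmount; not the kernel definition. */
struct mock_mount {
	const char *name;
};

/* Which kind of event the mock "fsnotify" reported last. */
enum mock_event_kind {
	MOCK_EVENT_NONE,
	MOCK_EVENT_INODE,	/* caller had no mount context */
	MOCK_EVENT_PATH		/* caller passed a mount, mount marks can match */
};

/* Hypothetical event recorder used instead of real notification. */
struct mock_event_log {
	enum mock_event_kind last_kind;
	int count;
};

/* Record a mkdir event; the event kind depends only on whether mnt is set. */
static void mock_notify_mkdir(struct mock_event_log *log,
			      const struct mock_mount *mnt)
{
	assert(log != NULL);
	log->last_kind = mnt ? MOCK_EVENT_PATH : MOCK_EVENT_INODE;
	log->count++;
}

/* The operation itself, without notification; an empty name fails. */
static int mock_mkdir_nonotify(const char *name)
{
	if (name == NULL || name[0] == '\0')
		return -1;
	return 0;
}

/* Wrapper: run the operation, notify only on success, return the error. */
static int mock_mkdir(struct mock_event_log *log,
		      const struct mock_mount *mnt, const char *name)
{
	int error = mock_mkdir_nonotify(name);

	if (!error)
		mock_notify_mkdir(log, mnt);
	return error;
}
```

In this model, a caller without mount context (the overlayfs-like case)
passes NULL for mnt and only an inode event is recorded, while a caller
with an exported mount (the nfsd-like case) passes it and gets a path
event. A failed operation records nothing.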