On Tue, Mar 14, 2017 at 1:02 AM, Filip Štědronský <r.lklm@xxxxxxxxxx> wrote: > Fanotify currently does not report directory modification events > (rename, unlink, etc.). But these events are essential for about half of > concievable fanotify use cases, especially: > > - file system indexers / desktop search tools > - file synchronization tools (like Dropbox, Nextcloud, etc.), > online backup tools This last one is the use case of my employer, Ctera Networks. Out of curiosity, what is the use case that you are focusing on? > > and pretty much any app that needs to maintain and up-to-date internal > representation of current contents of the file system. > > All applications of the above classes that I'm aware of currently use > recursive inotify watches, which do not scale (my home dir has ~200k > directories, creating all the watches takes ~2min and eats several tens > of MB of kernel memory). > > There have been many calls for such a feature, pretty much since the > creation of fanotify in 2009: > * By GNOME developers: > https://wiki.gnome.org/BastienNocera/KernelWishlist#VFS.2C_filesystems > * By KDE developers: > http://lkml.kernel.org/r/201211011352.42476.Martin%40lichtvoll.de > 'Better support for (desktop) file search / indexing applications' > * And others: > http://lkml.kernel.org/r/AANLkTi=owK=WZW4oNtpm5WpAZhqCQUdTR2K5gzJ_MqZ+%40mail.gmail.com > 'Fanotify mv/rename?' > http://lkml.kernel.org/r/1238158043.23703.20.camel%40fatty > 'Issues with using fanotify for a filesystem indexer' > Thanks for sharing this summary! I had the feeling that all recursive inotify users are hiding in the shadows, but was missing more concrete evidence. > Add a new fanotify event, FAN_MODIFY_DIR, that is emitted whenever the > contents of a directory change (a directory entry is added, removed or > renamed). This covers all the currently missing events: rename, unlink, > mknod, mkdir, rmdir, etc. > > Included with the event is a file descriptor to the modified directory > but no further info. The userspace application is expected to rescan the > whole directory and update its model accordingly. This needs to be done > carefully to prevent race conditions. A cross-directory rename generates > two FAN_MODIFY_DIR events. > Your approach is interesting and I am glad you shared it with us. I do like it and it gives me an idea, I am going to prepare a branch with a subset of my patches, so you can try them with your userspace sample program. In comments to your patches I am going to argue that, as a matter of fact, you can take a small sub set of my patches and get the same functionality of FAN_MODIFY_DIR, with code that is *less* complex then what you propose. So please treat my comments as technical comments how FAN_MODIFY_DIR should be implemented IMO and not as an attempt to reject an opposing proposal. > This minimalistic approach has several advantages: > * Does not require changes to the fanotify userspace API or the > fsnotify in-kernel framework, apart from adding a new event. About the argument of not having to change in-kernel framework, I don't think it should be a consideration at all. I claim that my 1st fsnotify cleanup series is an improvement of the framework (https://github.com/amir73il/linux/commits/fsnotify_dentry) regardless of additional functionality. So changing the in-kernel framework by adding complexity may be bad, but changing it by simplifying and making code more readable should not be an argument against the "non-minimalistic" approach. About the change to usespace API, if you strip off the *added* functionality of super block watch from my patches, your proposal for user API looks like this: fanotify_mark(fan_fd, FAN_MARK_ADD|FAN_MARK_MOUNT, \ FAN_MODIFY_DIR|FAN_ONDIR, And my proposal for user API looks like this: fanotify_mark(fan_fd, FAN_MARK_ADD|FAN_MARK_MOUNT, \ FAN_MOVE|FAN_CREATE|FAN_DELETE, You can see why my proposal for user API is not any less minimalistic and it is fully compatible with existing intotify API for the same events. I understand why it is hard to see this behind my added functionality, so I will post a minimalistic branch and test program as an example. > Especially doesn't complicate it by adding string fields. So the string fields in my proposal are optional. userspace opts-in to get them by specifying the FAN_EVENT_INFO_NAME flag: fanotify_init(FAN_CLOEXEC | FAN_CLASS_NOTIF | FAN_EVENT_INFO_PARENT | FAN_EVENT_INFO_NAME, If you don't specify FAN_EVENT_INFO_NAME, you can get filename events FAN_MOVE|FAN_CREATE|FAN_DELETE without the name. What you do get is the file descriptor of the parent. sounds familiar? ;-) The flag FAN_EVENT_INFO_PARENT in an explicit opt-in to get parent fd instead of victim id on specific events. If FAN_EVENT_INFO_PARENT is not specified in fanotify_init() the filename events FAN_MOVE|FAN_CREATE|FAN_DELETE are masked out of fanotify_mark(). This is important because your patchs adds FAN_MODIFY_DIR to FAN_ALL_EVENTS (as does mine). So old fanotify userspace code, compiled with new kernel headers, with: fanotify_mark(fan_fd, FAN_MARK_ADD|FAN_MARK_MOUNT, FAN_ALL_EVENTS, Will start getting your FAN_MODIFY_DIR events and those programs might have wrong assumptions about the provided fd. So I chose to have stronger backward compatibility, and added the opt-in FAN_EVENT_INFO_PARENT flag. > * Has simple and clear semantics, even with multiple renames occurring > in parallel etc. In case of any inconsistencies, one can simply wait > for a while and rescan again. > * FAN_MODIFY_DIR events are easily merged in case of multiple > operations on a directory (e.g. deleting all files). > I agree those 2 are important advantages. You could get them with my proposed API when dropping the FAN_EVENT_INFO_NAME flag. Thanks for pointing that out. > Signed-off-by: Filip Štědronský <r.lkml@xxxxxxxxxx> > > --- > > An alternative might be to create wrapper functions like > vfs_path_(rename|unlink|...). They could also take care of calling > security_path_(rename|unlink|...), which is currently also up to > the indvidual callers (possibly with a flag because it might not > be always desired). > > An alternative was proposed by Amir Goldstein in several long series of > patches that add superblock-scoped (as opposed to vfsmount-scoped) > fanotify watches and specific dentry manipulation events with filenames: > > http://lkml.kernel.org/r/1481984434-13293-1-git-send-email-amir73il%40gmail.com > http://lkml.kernel.org/r/1482247207-4424-1-git-send-email-amir73il%40gmail.com > http://lkml.kernel.org/r/1476126784-12520-1-git-send-email-amir73il%40gmail.com > http://lkml.kernel.org/r/1489411223-12081-1-git-send-email-amir73il%40gmail.com > > There is large but not complete overlap between that proposal and > mine (which is was originally created over a year ago, before Amir's > work, but never posted). > I am proud of the role I played to help your proposal see the day of light :-) > I think the superblock watch idea is very interesting because it might > in the future allow reporing fsnotify events from remote and virtual > filesystems. So I'm posting this more as a potential source of more > ideas for discussion, or a fallback proposal in case Amir's patches > don't make it. > --- > fs/notify/fanotify/fanotify.c | 1 + > include/linux/fsnotify.h | 17 +++++++++++++++++ > include/linux/fsnotify_backend.h | 1 + > include/uapi/linux/fanotify.h | 5 ++++- > 4 files changed, 23 insertions(+), 1 deletion(-) > > diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c > index bbc175d4213d..5178b06c338c 100644 > --- a/fs/notify/fanotify/fanotify.c > +++ b/fs/notify/fanotify/fanotify.c > @@ -186,6 +186,7 @@ static int fanotify_handle_event(struct fsnotify_group *group, > > BUILD_BUG_ON(FAN_ACCESS != FS_ACCESS); > BUILD_BUG_ON(FAN_MODIFY != FS_MODIFY); > + BUILD_BUG_ON(FAN_MODIFY_DIR != FS_MODIFY_DIR); > BUILD_BUG_ON(FAN_CLOSE_NOWRITE != FS_CLOSE_NOWRITE); > BUILD_BUG_ON(FAN_CLOSE_WRITE != FS_CLOSE_WRITE); > BUILD_BUG_ON(FAN_OPEN != FS_OPEN); > diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h > index b43d3f5bd9ea..00fb87c975d6 100644 > --- a/include/linux/fsnotify.h > +++ b/include/linux/fsnotify.h > @@ -208,6 +208,23 @@ static inline void fsnotify_modify(struct file *file) > } > > /* > + * fsnotify_modifydir - directory contents were changed > + * (as a result of rename, creat, unlink, etc.) > + */ > +static inline void fsnotify_modify_dir(struct path *path) > +{ > + struct inode *inode = path->dentry->d_inode; > + __u32 mask = FS_MODIFY_DIR; > + > + if (S_ISDIR(inode->i_mode)) > + mask |= FS_ISDIR; It is going to be somewhat confusing for users to understand the difference between FS_MODIFY_DIR and FS_MODIFY_DIR|FS_ISDIR. IMO, it is much more intuitive (and compatible with intoify semantics) when users get specific event information, e.g. FS_DELETE and FS_DELETE|FS_ISDIR. > + else > + return; > + > + fsnotify(inode, mask, path, FSNOTIFY_EVENT_PATH, NULL, 0); > +} > + > +/* > * fsnotify_open - file was opened > */ > static inline void fsnotify_open(struct file *file) > diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h > index 487246546ebe..7751b337ec31 100644 > --- a/include/linux/fsnotify_backend.h > +++ b/include/linux/fsnotify_backend.h > @@ -42,6 +42,7 @@ > > #define FS_OPEN_PERM 0x00010000 /* open event in an permission hook */ > #define FS_ACCESS_PERM 0x00020000 /* access event in a permissions hook */ > +#define FS_MODIFY_DIR 0x00040000 /* directory changed (rename/unlink/...) */ > > #define FS_EXCL_UNLINK 0x04000000 /* do not send events if object is unlinked */ > #define FS_ISDIR 0x40000000 /* event occurred against dir */ > diff --git a/include/uapi/linux/fanotify.h b/include/uapi/linux/fanotify.h > index 030508d195d3..f14e048d492a 100644 > --- a/include/uapi/linux/fanotify.h > +++ b/include/uapi/linux/fanotify.h > @@ -15,6 +15,8 @@ > #define FAN_OPEN_PERM 0x00010000 /* File open in perm check */ > #define FAN_ACCESS_PERM 0x00020000 /* File accessed in perm check */ > > +#define FAN_MODIFY_DIR 0x00040000 /* directory changed (rename/unlink/...) */ > + > #define FAN_ONDIR 0x40000000 /* event occurred against dir */ > > #define FAN_EVENT_ON_CHILD 0x08000000 /* interested in child events */ > @@ -67,7 +69,8 @@ > #define FAN_ALL_EVENTS (FAN_ACCESS |\ > FAN_MODIFY |\ > FAN_CLOSE |\ > - FAN_OPEN) > + FAN_OPEN |\ > + FAN_MODIFY_DIR) > > /* > * All events which require a permission response from userspace > -- > 2.11.1 >