Re: [RFC 1/2] fanotify: new event FAN_MODIFY_DIR

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Mar 14, 2017 at 1:02 AM, Filip Štědronský <r.lklm@xxxxxxxxxx> wrote:
> Fanotify currently does not report directory modification events
> (rename, unlink, etc.). But these events are essential for about half of
> concievable fanotify use cases, especially:
>
>   - file system indexers / desktop search tools
>   - file synchronization tools (like Dropbox, Nextcloud, etc.),
>     online backup tools

This last one is the use case of my employer, Ctera Networks.
Out of curiosity, what is the use case that you are focusing on?

>
> and pretty much any app that needs to maintain and up-to-date internal
> representation of current contents of the file system.
>
> All applications of the above classes that I'm aware of currently use
> recursive inotify watches, which do not scale (my home dir has ~200k
> directories, creating all the watches takes ~2min and eats several tens
> of MB of kernel memory).
>
> There have been many calls for such a feature, pretty much since the
> creation of fanotify in 2009:
>   * By GNOME developers:
>     https://wiki.gnome.org/BastienNocera/KernelWishlist#VFS.2C_filesystems
>   * By KDE developers:
>     http://lkml.kernel.org/r/201211011352.42476.Martin%40lichtvoll.de
>     'Better support for (desktop) file search / indexing applications'
>   * And others:
>     http://lkml.kernel.org/r/AANLkTi=owK=WZW4oNtpm5WpAZhqCQUdTR2K5gzJ_MqZ+%40mail.gmail.com
>     'Fanotify mv/rename?'
>     http://lkml.kernel.org/r/1238158043.23703.20.camel%40fatty
>     'Issues with using fanotify for a filesystem indexer'
>

Thanks for sharing this summary!
I had the feeling that all recursive inotify users are hiding in the shadows,
but was missing more concrete evidence.

> Add a new fanotify event, FAN_MODIFY_DIR, that is emitted whenever the
> contents of a directory change (a directory entry is added, removed or
> renamed). This covers all the currently missing events: rename, unlink,
> mknod, mkdir, rmdir, etc.
>
> Included with the event is a file descriptor to the modified directory
> but no further info. The userspace application is expected to rescan the
> whole directory and update its model accordingly. This needs to be done
> carefully to prevent race conditions. A cross-directory rename generates
> two FAN_MODIFY_DIR events.
>

Your approach is interesting and I am glad you shared it with us.
I do like it and it gives me an idea, I am going to prepare a branch
with a subset
of my patches, so you can try them with your userspace sample program.

In comments to your patches I am going to argue that, as a matter of fact,
you can take a small sub set of my patches and get the same functionality
of FAN_MODIFY_DIR, with code that is *less* complex then what you propose.
So please treat my comments as technical comments how FAN_MODIFY_DIR
should be implemented IMO and not as an attempt to reject an opposing proposal.

> This minimalistic approach has several advantages:
>   * Does not require changes to the fanotify userspace API or the
>     fsnotify in-kernel framework, apart from adding a new event.

About the argument of not having to change in-kernel framework,
I don't think it should be a consideration at all.
I claim that my 1st fsnotify cleanup series is an improvement of the framework
(https://github.com/amir73il/linux/commits/fsnotify_dentry)
regardless of additional functionality.
So changing the in-kernel framework by adding complexity may be bad,
but changing it by simplifying and making code more readable should not
be an argument against the "non-minimalistic" approach.

About the change to usespace API, if you strip off the *added* functionality
of super block watch from my patches, your proposal for user API looks
like this:

    fanotify_mark(fan_fd, FAN_MARK_ADD|FAN_MARK_MOUNT, \
                           FAN_MODIFY_DIR|FAN_ONDIR,

And my proposal for user API looks like this:

    fanotify_mark(fan_fd, FAN_MARK_ADD|FAN_MARK_MOUNT, \
                           FAN_MOVE|FAN_CREATE|FAN_DELETE,

You can see why my proposal for user API is not any less minimalistic
and it is fully compatible with existing intotify API for the same events.

I understand why it is hard to see this behind my added functionality,
so I will post a minimalistic branch and test program as an example.

>     Especially doesn't complicate it by adding string fields.

So the string fields in my proposal are optional.
userspace opts-in to get them by specifying the  FAN_EVENT_INFO_NAME flag:

    fanotify_init(FAN_CLOEXEC | FAN_CLASS_NOTIF |
                       FAN_EVENT_INFO_PARENT | FAN_EVENT_INFO_NAME,

If you don't specify FAN_EVENT_INFO_NAME, you can get filename events
FAN_MOVE|FAN_CREATE|FAN_DELETE without the name.

What you do get is the file descriptor of the parent. sounds familiar? ;-)

The flag FAN_EVENT_INFO_PARENT in an explicit opt-in to get parent fd
instead of victim id on specific events.
If  FAN_EVENT_INFO_PARENT is not specified in fanotify_init()
the filename events FAN_MOVE|FAN_CREATE|FAN_DELETE are masked
out of fanotify_mark().

This is important because your patchs adds FAN_MODIFY_DIR to
FAN_ALL_EVENTS (as does mine).
So old fanotify userspace code, compiled with new kernel headers, with:

    fanotify_mark(fan_fd, FAN_MARK_ADD|FAN_MARK_MOUNT,
                           FAN_ALL_EVENTS,

Will start getting your FAN_MODIFY_DIR events and those programs
might have wrong assumptions about the provided fd.

So I chose to have stronger backward compatibility, and added the opt-in
FAN_EVENT_INFO_PARENT flag.

>   * Has simple and clear semantics, even with multiple renames occurring
>     in parallel etc. In case of any inconsistencies, one can simply wait
>     for a while and rescan again.
>   * FAN_MODIFY_DIR events are easily merged in case of multiple
>     operations on a directory (e.g. deleting all files).
>

I agree those 2 are important advantages.
You could get them with my proposed API when dropping the
FAN_EVENT_INFO_NAME flag.

Thanks for pointing that out.

> Signed-off-by: Filip Štědronský <r.lkml@xxxxxxxxxx>
>
> ---
>
> An alternative might be to create wrapper functions like
> vfs_path_(rename|unlink|...). They could also take care of calling
> security_path_(rename|unlink|...), which is currently also up to
> the indvidual callers (possibly with a flag because it might not
> be always desired).
>
> An alternative was proposed by Amir Goldstein in several long series of
> patches that add superblock-scoped (as opposed to vfsmount-scoped)
> fanotify watches and specific dentry manipulation events with filenames:
>
>     http://lkml.kernel.org/r/1481984434-13293-1-git-send-email-amir73il%40gmail.com
>     http://lkml.kernel.org/r/1482247207-4424-1-git-send-email-amir73il%40gmail.com
>     http://lkml.kernel.org/r/1476126784-12520-1-git-send-email-amir73il%40gmail.com
>     http://lkml.kernel.org/r/1489411223-12081-1-git-send-email-amir73il%40gmail.com
>
> There is large but not complete overlap between that proposal and
> mine (which is was originally created over a year ago, before Amir's
> work, but never posted).
>

I am proud of the role I played to help your proposal see the day of light :-)

> I think the superblock watch idea is very interesting because it might
> in the future allow reporing fsnotify events from remote and virtual
> filesystems. So I'm posting this more as a potential source of more
> ideas for discussion, or a fallback proposal in case Amir's patches
> don't make it.
> ---
>  fs/notify/fanotify/fanotify.c    |  1 +
>  include/linux/fsnotify.h         | 17 +++++++++++++++++
>  include/linux/fsnotify_backend.h |  1 +
>  include/uapi/linux/fanotify.h    |  5 ++++-
>  4 files changed, 23 insertions(+), 1 deletion(-)
>
> diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
> index bbc175d4213d..5178b06c338c 100644
> --- a/fs/notify/fanotify/fanotify.c
> +++ b/fs/notify/fanotify/fanotify.c
> @@ -186,6 +186,7 @@ static int fanotify_handle_event(struct fsnotify_group *group,
>
>         BUILD_BUG_ON(FAN_ACCESS != FS_ACCESS);
>         BUILD_BUG_ON(FAN_MODIFY != FS_MODIFY);
> +       BUILD_BUG_ON(FAN_MODIFY_DIR != FS_MODIFY_DIR);
>         BUILD_BUG_ON(FAN_CLOSE_NOWRITE != FS_CLOSE_NOWRITE);
>         BUILD_BUG_ON(FAN_CLOSE_WRITE != FS_CLOSE_WRITE);
>         BUILD_BUG_ON(FAN_OPEN != FS_OPEN);
> diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
> index b43d3f5bd9ea..00fb87c975d6 100644
> --- a/include/linux/fsnotify.h
> +++ b/include/linux/fsnotify.h
> @@ -208,6 +208,23 @@ static inline void fsnotify_modify(struct file *file)
>  }
>
>  /*
> + * fsnotify_modifydir - directory contents were changed
> + * (as a result of rename, creat, unlink, etc.)
> + */
> +static inline void fsnotify_modify_dir(struct path *path)
> +{
> +       struct inode *inode = path->dentry->d_inode;
> +       __u32 mask = FS_MODIFY_DIR;
> +
> +       if (S_ISDIR(inode->i_mode))
> +               mask |= FS_ISDIR;

It is going to be somewhat confusing for users to understand
the difference between FS_MODIFY_DIR and FS_MODIFY_DIR|FS_ISDIR.
IMO, it is much more intuitive (and compatible with intoify semantics)
when users
get specific event information, e.g. FS_DELETE and FS_DELETE|FS_ISDIR.


> +       else
> +               return;
> +
> +       fsnotify(inode, mask, path, FSNOTIFY_EVENT_PATH, NULL, 0);
> +}
> +
> +/*
>   * fsnotify_open - file was opened
>   */
>  static inline void fsnotify_open(struct file *file)
> diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
> index 487246546ebe..7751b337ec31 100644
> --- a/include/linux/fsnotify_backend.h
> +++ b/include/linux/fsnotify_backend.h
> @@ -42,6 +42,7 @@
>
>  #define FS_OPEN_PERM           0x00010000      /* open event in an permission hook */
>  #define FS_ACCESS_PERM         0x00020000      /* access event in a permissions hook */
> +#define FS_MODIFY_DIR          0x00040000      /* directory changed (rename/unlink/...) */
>
>  #define FS_EXCL_UNLINK         0x04000000      /* do not send events if object is unlinked */
>  #define FS_ISDIR               0x40000000      /* event occurred against dir */
> diff --git a/include/uapi/linux/fanotify.h b/include/uapi/linux/fanotify.h
> index 030508d195d3..f14e048d492a 100644
> --- a/include/uapi/linux/fanotify.h
> +++ b/include/uapi/linux/fanotify.h
> @@ -15,6 +15,8 @@
>  #define FAN_OPEN_PERM          0x00010000      /* File open in perm check */
>  #define FAN_ACCESS_PERM                0x00020000      /* File accessed in perm check */
>
> +#define FAN_MODIFY_DIR         0x00040000      /* directory changed (rename/unlink/...) */
> +
>  #define FAN_ONDIR              0x40000000      /* event occurred against dir */
>
>  #define FAN_EVENT_ON_CHILD     0x08000000      /* interested in child events */
> @@ -67,7 +69,8 @@
>  #define FAN_ALL_EVENTS (FAN_ACCESS |\
>                         FAN_MODIFY |\
>                         FAN_CLOSE |\
> -                       FAN_OPEN)
> +                       FAN_OPEN |\
> +                       FAN_MODIFY_DIR)
>
>  /*
>   * All events which require a permission response from userspace
> --
> 2.11.1
>




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux