On Tue, Nov 12, 2024 at 2:55 PM Jan Kara <jack@xxxxxxx> wrote:
>
> On Tue 12-11-24 09:11:32, Amir Goldstein wrote:
> > On Tue, Nov 12, 2024 at 1:37 AM Linus Torvalds
> > <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> > > On Mon, 11 Nov 2024 at 16:00, Amir Goldstein <amir73il@xxxxxxxxx> wrote:
> > > >
> > > > I think that's a good idea for pre-content events, because it's fine
> > > > to say that if the sb/mount was not watched by a pre-content event listener
> > > > at the time of file open, then we do not care.
> > >
> > > Right.
> > >
> > > > The problem is that legacy inotify/fanotify watches can be added after
> > > > file is open, so that is allegedly why this optimization was not done for
> > > > fsnotify hooks in the past.
> > >
> > > So honestly, even if the legacy fsnotify hooks can't look at the file
> > > flag, they could damn well look at an inode flag.
> >
> > Legacy fanotify has a mount watch (FAN_MARK_MOUNT),
> > which is the common way for Anti-malware to set watches on
> > filesystems, so I am not sure what you are saying.
> >
> > > And I'm not even convinced that we couldn't fix them to just look at a
> > > file flag, and say "tough luck, somebody opened that file before you
> > > started watching, you don't get to see what they did".
> >
> > That would specifically break tail -f (for inotify) and probably many other
> > tools, but as long as we also look at the inode flags (i_fsnotify_mask)
> > and the dentry flags (DCACHE_FSNOTIFY_PARENT_WATCHED),
> > then I think we may be able to get away with changing the semantics
> > for open files on a fanotify mount watch.
>
> Yes, I agree we cannot afford to generate FS_MODIFY event only if the mark
> was placed after file open. There's too much stuff in userspace depending
> on this since this behavior dates back to inotify interface sometime in
> 2010 or so.
>
> > Specifically, I would really like to eliminate completely the cost of
> > FAN_ACCESS_PERM event, which could be gated on file flag, because
> > this is only for security/Anti-malware and I don't think this event is
> > practically useful and it sure does not need to guarantee permission
> > events to mount watchers on already open files.
>
> For traditional fanotify permission events I agree generating them only if
> the mark was placed before open is likely fine but we'll have to try and
> see whether something breaks. For the new pre-content events I like the
> per-file flag as Linus suggested. That should indeed save us some cache
> misses in some fast paths.

FWIW, attached is a patch that implements FMODE_NOTIFY_PERM.

I have asked Oliver to run his performance tests to see if we can
observe an improvement with existing workloads, but it sure is going
to be useful for pre-content events.

For example, here is what the pre-content helper looks like after
I adapted Josef's patches to use the flag:

static inline bool fsnotify_file_has_pre_content_watches(struct file *file)
{
	if (!(file->f_mode & FMODE_NOTIFY_PERM))
		return false;

	if (!(file_inode(file)->i_sb->s_iflags & SB_I_ALLOW_HSM))
		return false;

	return fsnotify_file_object_watched(file, FSNOTIFY_PRE_CONTENT_EVENTS);
}

Thanks,
Amir.
From 8c8e9452d153a1918470cbe52a8eb6505c675911 Mon Sep 17 00:00:00 2001
From: Amir Goldstein <amir73il@xxxxxxxxx>
Date: Tue, 12 Nov 2024 13:46:08 +0100
Subject: [PATCH] fsnotify: opt-in for permission events at file_open_perm() time

Legacy inotify/fanotify listeners can add watches for events on inode,
parent or mount and expect to get events (e.g. FS_MODIFY) on files that
were already open at the time of setting up the watches.

fanotify permission events are typically used by Anti-malware software
that is watching the entire mount, and it is not common to have more than
one Anti-malware engine installed on a system.

To reduce the overhead of the fsnotify_file_perm() hooks on every file
access, relax the semantics of the legacy FAN_ACCESS_PERM event to
generate events only if there were *any* permission event listeners on
the filesystem at the time that the file was opened.

The new semantics, implemented with the opt-in FMODE_NOTIFY_PERM flag,
are also going to apply to the new fanotify pre-content event in order
to reduce the cost of the pre-content event vfs hooks.

Suggested-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Link: https://lore.kernel.org/linux-fsdevel/CAHk-=wj8L=mtcRTi=NECHMGfZQgXOp_uix1YVh04fEmrKaMnXA@xxxxxxxxxxxxxx/
Signed-off-by: Amir Goldstein <amir73il@xxxxxxxxx>
---
 include/linux/fs.h       |  3 ++-
 include/linux/fsnotify.h | 47 ++++++++++++++++++++++++++++------------
 2 files changed, 35 insertions(+), 15 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 9c13222362f5..9b58e9887e4b 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -173,7 +173,8 @@ typedef int (dio_iodone_t)(struct kiocb *iocb, loff_t offset,
 
 #define FMODE_NOREUSE		((__force fmode_t)(1 << 23))
 
-/* FMODE_* bit 24 */
+/* File may generate fanotify access permission events */
+#define FMODE_NOTIFY_PERM	((__force fmode_t)(1 << 24))
 
 /* File is embedded in backing_file object */
 #define FMODE_BACKING		((__force fmode_t)(1 << 25))
diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
index 278620e063ab..f0fd3dcae654 100644
--- a/include/linux/fsnotify.h
+++ b/include/linux/fsnotify.h
@@ -108,10 +108,9 @@ static inline void fsnotify_dentry(struct dentry *dentry, __u32 mask)
 	fsnotify_parent(dentry, mask, dentry, FSNOTIFY_EVENT_DENTRY);
 }
 
-static inline int fsnotify_file(struct file *file, __u32 mask)
+/* Should events be generated on this open file regardless of watches? */
+static inline bool fsnotify_file_watchable(struct file *file, __u32 mask)
 {
-	const struct path *path;
-
 	/*
 	 * FMODE_NONOTIFY are fds generated by fanotify itself which should not
 	 * generate new events. We also don't want to generate events for
@@ -119,14 +118,37 @@ static inline int fsnotify_file(struct file *file, __u32 mask)
 	 * handle creation / destruction events and not "real" file events.
 	 */
 	if (file->f_mode & (FMODE_NONOTIFY | FMODE_PATH))
+		return false;
+
+	/* Permission events require that watches are set before FS_OPEN_PERM */
+	if (mask & ALL_FSNOTIFY_PERM_EVENTS & ~FS_OPEN_PERM &&
+	    !(file->f_mode & FMODE_NOTIFY_PERM))
+		return false;
+
+	return true;
+}
+
+static inline int fsnotify_file(struct file *file, __u32 mask)
+{
+	const struct path *path;
+
+	if (!fsnotify_file_watchable(file, mask))
 		return 0;
 
 	path = &file->f_path;
-	/* Permission events require group prio >= FSNOTIFY_PRIO_CONTENT */
-	if (mask & ALL_FSNOTIFY_PERM_EVENTS &&
-	    !fsnotify_sb_has_priority_watchers(path->dentry->d_sb,
-					       FSNOTIFY_PRIO_CONTENT))
-		return 0;
+	/*
+	 * Permission events require group prio >= FSNOTIFY_PRIO_CONTENT.
+	 * Unless permission event watchers exist at FS_OPEN_PERM time,
+	 * operations on file will not be generating any permission events.
+	 */
+	if (mask & ALL_FSNOTIFY_PERM_EVENTS) {
+		if (!fsnotify_sb_has_priority_watchers(path->dentry->d_sb,
+						FSNOTIFY_PRIO_CONTENT))
+			return 0;
+
+		if (mask & FS_OPEN_PERM)
+			file->f_mode |= FMODE_NOTIFY_PERM;
+	}
 
 	return fsnotify_parent(path->dentry, mask, path, FSNOTIFY_EVENT_PATH);
 }
@@ -166,15 +188,12 @@ static inline int fsnotify_file_perm(struct file *file, int perm_mask)
  */
 static inline int fsnotify_open_perm(struct file *file)
 {
-	int ret;
+	int ret = fsnotify_file(file, FS_OPEN_PERM);
 
-	if (file->f_flags & __FMODE_EXEC) {
+	if (!ret && file->f_flags & __FMODE_EXEC)
 		ret = fsnotify_file(file, FS_OPEN_EXEC_PERM);
-		if (ret)
-			return ret;
-	}
 
-	return fsnotify_file(file, FS_OPEN_PERM);
+	return ret;
 }
 
 #else
-- 
2.34.1
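
For readers who want to reason about the new semantics without building a kernel, the gating in fsnotify_file() can be approximated in user space. The following is only a rough sketch and not kernel code: every model_* and MODEL_* name is hypothetical, the superblock watcher check is reduced to a single boolean, and the FMODE_NONOTIFY/FMODE_PATH check and the real watch lookup are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's event and mode bits. */
#define MODEL_OPEN_PERM    (1u << 0)
#define MODEL_ACCESS_PERM  (1u << 1)
#define MODEL_ALL_PERM     (MODEL_OPEN_PERM | MODEL_ACCESS_PERM)
#define MODEL_MODIFY       (1u << 2)   /* a non-permission event */
#define MODEL_NOTIFY_PERM  (1u << 24)  /* mirrors the new per-file flag */

/* Assumption: one boolean replaces the sb priority-watchers check. */
struct model_sb   { bool has_perm_watchers; };
struct model_file { uint32_t f_mode; struct model_sb *sb; };

/* Returns true if the event would proceed to the watch lookup. */
static bool model_fsnotify_file(struct model_file *file, uint32_t mask)
{
	/* Permission events other than open need the flag set at open time. */
	if ((mask & MODEL_ALL_PERM & ~MODEL_OPEN_PERM) &&
	    !(file->f_mode & MODEL_NOTIFY_PERM))
		return false;

	if (mask & MODEL_ALL_PERM) {
		if (!file->sb->has_perm_watchers)
			return false;

		/* Opt this file in for later permission events. */
		if (mask & MODEL_OPEN_PERM)
			file->f_mode |= MODEL_NOTIFY_PERM;
	}

	return true;
}
```

In this model, a file opened before any permission listener exists never passes MODEL_ACCESS_PERM, even if a listener shows up later, while non-permission events such as MODEL_MODIFY are not affected by the flag. That matches the intent stated in the commit message; whether real workloads tolerate it is the open question raised in the thread.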