On 1/15/24 9:11 AM, Jan Kara wrote: > On Fri 12-01-24 07:11:42, Jens Axboe wrote: >> On 1/12/24 6:58 AM, Jens Axboe wrote: >>> On 1/12/24 6:00 AM, Amir Goldstein wrote: >>>> On Fri, Jan 12, 2024 at 1:09?PM Jan Kara <jack@xxxxxxx> wrote: >>>>> >>>>> On Thu 11-01-24 17:22:33, Amir Goldstein wrote: >>>>>> Commit e43de7f0862b ("fsnotify: optimize the case of no marks of any type") >>>>>> optimized the case where there are no fsnotify watchers on any of the >>>>>> filesystem's objects. >>>>>> >>>>>> It is quite common for a system to have a single local filesystem and >>>>>> it is quite common for the system to have some inotify watches on some >>>>>> config files or directories, so the optimization of no marks at all is >>>>>> often not in effect. >>>>>> >>>>>> Content event (i.e. access,modify) watchers on sb/mount more rare, so >>>>>> optimizing the case of no sb/mount marks with content events can improve >>>>>> performance for more systems, especially for performance sensitive io >>>>>> workloads. >>>>>> >>>>>> Set a per-sb flag SB_I_CONTENT_WATCHED if sb/mount marks with content >>>>>> events in their mask exist and use that flag to optimize out the call to >>>>>> __fsnotify_parent() and fsnotify() in fsnotify access/modify hooks. >>>>>> >>>>>> Signed-off-by: Amir Goldstein <amir73il@xxxxxxxxx> >>>>> >>>>> ... >>>>> >>>>>> -static inline int fsnotify_file(struct file *file, __u32 mask) >>>>>> +static inline int fsnotify_path(const struct path *path, __u32 mask) >>>>>> { >>>>>> - const struct path *path; >>>>>> + struct dentry *dentry = path->dentry; >>>>>> >>>>>> - if (file->f_mode & FMODE_NONOTIFY) >>>>>> + if (!fsnotify_sb_has_watchers(dentry->d_sb)) >>>>>> return 0; >>>>>> >>>>>> - path = &file->f_path; >>>>>> + /* Optimize the likely case of sb/mount/parent not watching content */ >>>>>> + if (mask & FSNOTIFY_CONTENT_EVENTS && >>>>>> + likely(!(dentry->d_flags & DCACHE_FSNOTIFY_PARENT_WATCHED)) && >>>>>> + likely(!(dentry->d_sb->s_iflags & SB_I_CONTENT_WATCHED))) { >>>>>> + /* >>>>>> + * XXX: if SB_I_CONTENT_WATCHED is not set, checking for content >>>>>> + * events in s_fsnotify_mask is redundant, but it will be needed >>>>>> + * if we use the flag FS_MNT_CONTENT_WATCHED to indicate the >>>>>> + * existence of only mount content event watchers. >>>>>> + */ >>>>>> + __u32 marks_mask = d_inode(dentry)->i_fsnotify_mask | >>>>>> + dentry->d_sb->s_fsnotify_mask; >>>>>> + >>>>>> + if (!(mask & marks_mask)) >>>>>> + return 0; >>>>>> + } >>>>> >>>>> So I'm probably missing something but how is all this patch different from: >>>>> >>>>> if (likely(!(dentry->d_flags & DCACHE_FSNOTIFY_PARENT_WATCHED))) { >>>>> __u32 marks_mask = d_inode(dentry)->i_fsnotify_mask | >>>>> path->mnt->mnt_fsnotify_mask | >>>> >>>> It's actually: >>>> >>>> real_mount(path->mnt)->mnt_fsnotify_mask >>>> >>>> and this requires including "internal/mount.h" in all the call sites. >>>> >>>>> dentry->d_sb->s_fsnotify_mask; >>>>> if (!(mask & marks_mask)) >>>>> return 0; >>>>> } >>>>> >>>>> I mean (mask & FSNOTIFY_CONTENT_EVENTS) is true for the frequent events >>>>> (read & write) we care about. In Jens' case >>>>> >>>>> !(dentry->d_flags & DCACHE_FSNOTIFY_PARENT_WATCHED) && >>>>> !(dentry->d_sb->s_iflags & SB_I_CONTENT_WATCHED) >>>>> >>>>> is true as otherwise we'd go right to fsnotify_parent() and so Jens >>>>> wouldn't see the performance benefit. But then with your patch you fetch >>>>> i_fsnotify_mask and s_fsnotify_mask anyway for the test so the only >>>>> difference to what I suggest above is the path->mnt->mnt_fsnotify_mask >>>>> fetch but that is equivalent to sb->s_iflags (or wherever we store that >>>>> bit) fetch? >>>>> >>>>> So that would confirm that the parent handling costs in fsnotify_parent() >>>>> is what's really making the difference and just avoiding that by checking >>>>> masks early should be enough? >>>> >>>> Can't the benefit be also related to saving a function call? >>>> >>>> Only one way to find out... >>>> >>>> Jens, >>>> >>>> Can you please test attached v3 with a non-inlined fsnotify_path() helper? >>> >>> I can run it since it doesn't take much to do that, but there's no way >>> parallel universe where saving a function call would yield those kinds >>> of wins (or have that much cost). >> >> Ran this patch, and it's better than mainline for sure, but it does have >> additional overhead that the previous version did not: >> >> +1.46% [kernel.vmlinux] [k] fsnotify_path > > So did you see any effect in IOPS or just this difference in perf profile? > Because Amir's patch took a bunch of code that was previously inlined > (thus its cost was blended with the cost of the rest of the read/write > code) and moved it to this new fsnotify_path() function so its cost is now > visible... These tests are CPU bound, but I don't recall for this one as there's a bit of a mixup with the previously reported regression for 6.8-git where we now do an extra fsnotify call per mem import. -- Jens Axboe