Re: [RFC][PATCH] fsnotify: optimize the case of no access event watchers

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jan 9, 2024 at 10:24 PM Jens Axboe <axboe@xxxxxxxxx> wrote:
>
> On 1/9/24 1:12 PM, Jens Axboe wrote:
> > On 1/9/24 12:48 PM, Amir Goldstein wrote:
> >> Commit e43de7f0862b ("fsnotify: optimize the case of no marks of any type")
> >> optimized the case where there are no fsnotify watchers on any of the
> >> filesystem's objects.
> >>
> >> It is quite common for a system to have a single local filesystem and
> >> it is quite common for the system to have some inotify watches on some
> >> config files or directories, so the optimization of no marks at all is
> >> often not in effect.
> >>
> >> Access event watchers are far less common, so optimizing the case of
> >> no marks with access events could improve performance for more systems,
> >> especially for the performance sensitive hot io path.
> >>
> >> Maintain a per-sb counter of objects that have marks with access
> >> events in their mask and use that counter to optimize out the call to
> >> fsnotify() in fsnotify access hooks.
> >>
> >> Signed-off-by: Amir Goldstein <amir73il@xxxxxxxxx>
> >> ---
> >>
> >> Jens,
> >>
> >> You may want to try if this patch improves performance for your workload
> >> with SECURITY=Y and FANOTIFY_ACCESS_PERMISSIONS=Y.
> >
> > Ran the usual test, and this effectively removes fsnotify from the
> > profiles, which (as per other email) is between 5-6% of CPU time. So I'd
> > say it looks mighty appealing!
>
> Tried with an IRQ based workload as well, as those are always impacted
> more by the fsnotify slowness. This patch removes ~8% of useless
> overhead in that case, so even bigger win there.
>

Do the IRQ based workloads always go through io_req_io_end()?
Meaning that unlike the polled io workloads, they also incur the
overhead of the fsnotify_{modify,access}() hooks?

I remember I asked you once (cannot find where) whether
io_complete_rw_iopoll() needs the fsnotify hooks and you said that
it is a highly specialized code path for fast io, whose users will not
want those access/modify hooks.

Considering the fact that fsnotify_{modify,access}() could just as well
be bypassed by mmap() read/write, I fully support this reasoning.

Anyway, that explains (to me) why compiling-out the fsnotify_perm()
hooks took away all the regression that you observed in upstream,
because I was wondering where the overhead of fsnotify_access() was.

Jan,

What are your thoughts about this optimization patch?

My thoughts are that the optimization is clearly a win, but do we
really want to waste a full long in super_block for counting access
event watchers that may never exist?

Should we perhaps instead use a flag to say that "access watchers
existed"?

We could put s_fsnotify_access_watchers inside a struct
fsnotify_sb_mark_connector and special case alloc/free of
FSNOTIFY_OBJ_TYPE_SB connector.

Using a specialized fsnotify_sb_mark_connector will make room for
similar optimizations in the future and could be a placeholder for
the "default permission mask" that we discussed.

Thoughts?

Thanks,
Amir.





[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux