On Tue 09-01-24 21:48:18, Amir Goldstein wrote: > Commit e43de7f0862b ("fsnotify: optimize the case of no marks of any type") > optimized the case where there are no fsnotify watchers on any of the > filesystem's objects. > > It is quite common for a system to have a single local filesystem and > it is quite common for the system to have some inotify watches on some > config files or directories, so the optimization of no marks at all is > often not in effect. I agree. > Access event watchers are far less common, so optimizing the case of > no marks with access events could improve performance for more systems, > especially for the performance sensitive hot io path. > > Maintain a per-sb counter of objects that have marks with access > events in their mask and use that counter to optimize out the call to > fsnotify() in fsnotify access hooks. > > Signed-off-by: Amir Goldstein <amir73il@xxxxxxxxx> I'm not saying no to this but let's discuss first before hacking in some partial solution :). AFAIU what Jens sees is something similar as was reported for example here [1]. In these cases we go through fsnotify_parent() down to fsnotify() until the check: if (!(test_mask & marks_mask)) return 0; there. And this is all relatively cheap (pure fetches like sb->s_fsnotify_marks, mnt->mnt_fsnotify_marks, inode->i_fsnotify_marks, sb->s_fsnotify_mask, mnt->mnt_fsnotify_mask, inode->i_fsnotify_mask) except for parent handling in __fsnotify_parent(). That requires multiple atomic operations - take_dentry_name_snapshot() is lock & unlock of d_lock, dget() is cmpxchg on d_lockref, dput() is another cmpxchg on d_lockref - and this gets really expensive, even more so if multiple threads race for the same parent dentry. So I think what we ideally need is a way to avoid this expensive "stabilize parent & its name" game unless we are pretty sure we are going to generate the event. There's no way to avoid fetching the parent early if dentry->d_flags & DCACHE_FSNOTIFY_PARENT_WATCHED (we can still postpone the take_dentry_name_snapshot() cost though). However for the case where dentry->d_flags & DCACHE_FSNOTIFY_PARENT_WATCHED == 0 (and this is the case here AFAICT) we need the parent only after the check of 'test_mask & marks_mask'. So in that case we could completely avoid the extra cost of parent dentry handling. So in principle I believe we could make fsnotify noticeably lighter for the case where no event is generated. Just the question is how to refactor the current set of functions to achieve this without creating an unmaintainable mess. I suspect if we lifted creation & some prefilling of iter_info into __fsnotify_parent() and then fill in the parent in case we need it for reporting only in fsnotify(), the code could be reasonably readable. We'd need to always propagate the dentry down to fsnotify() though, currently we often propagate only the inode because the dentry may be (still in case of create or already in case of unlink) negative. Similarly we'd somehow need to communicate down into fsnotify() whether the passed name needs snapshotting or not... What do you think? Honza [1] https://lore.kernel.org/all/SJ0PR08MB6494F5A32B7C60A5AD8B33C2AB09A@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR