> > > didn't look too closely at your series as I'm not familiar with fsnotify > > > in general. However, at a glance it looks like fsnotify_parent() executes > > > a substantial amount of code even if there are no watchers but I could > > > be wrong. > > > > > > > I don't about substantial, I would say it is on par with the amount of > > code that you tries to optimize out of fsnotify(). > > > > Before bailing out with DCACHE_FSNOTIFY_PARENT_WATCHED > > test, it also references d_inode->i_sb, real_mount(path->mnt) > > and fetches all their ->x_fsnotify_mask fields. > > > > I changed the call pattern from open/modify/... hooks from: > > fsnotify_parent(...); > > fsnotify(...); > > > > to: > > fsnotify_parent(...); /* which calls fsnotify() */ > > > > So the NULL marks optimization could be done in beginning of > > fsnotify_parent() and it will be just as effective as it is in fsnotify(). > > > > Something like that may be required because > > 5.7.0 5.7.0 5.7.0 5.7.0 > vanilla fastfsnotify-v1r1 fastfsnotify-v2r1 amir-20200608 > Amean 1 0.4837 ( 0.00%) 0.4630 * 4.27%* 0.4597 * 4.96%* 0.4967 * -2.69%* > Amean 3 1.5447 ( 0.00%) 1.4557 ( 5.76%) 1.5310 ( 0.88%) 1.6587 * -7.38%* > Amean 5 2.6037 ( 0.00%) 2.4363 ( 6.43%) 2.4237 ( 6.91%) 2.6400 ( -1.40%) > Amean 7 3.5987 ( 0.00%) 3.4757 ( 3.42%) 3.6543 ( -1.55%) 3.9040 * -8.48%* > Amean 12 5.8267 ( 0.00%) 5.6983 ( 2.20%) 5.5903 ( 4.06%) 6.2593 ( -7.43%) > Amean 18 8.4400 ( 0.00%) 8.1327 ( 3.64%) 7.7150 * 8.59%* 8.9940 ( -6.56%) > Amean 24 11.0187 ( 0.00%) 10.0290 * 8.98%* 9.8977 * 10.17%* 11.7247 * -6.41%* > Amean 30 13.1013 ( 0.00%) 12.8510 ( 1.91%) 12.2087 * 6.81%* 14.0290 * -7.08%* > Amean 32 13.9190 ( 0.00%) 13.2410 ( 4.87%) 13.2900 ( 4.52%) 14.7140 * -5.71%* > > vanilla and fastnotify-v1r1 are the same. fastfsnotify-v2r1 is just the > fsnotify_parent() change which is mostly worse and may indicate that the > first patch was reasonable. amir-20200608 is your branch as of today and > it appears to introduce a substantial regression albeit in an extreme case > where fsnotify overhead is visible. The regressions are mostly larger > than noise with the caveat it may be machine specific given that the > machine is overloaded. I accept that adding extra functional to fsnotify > may be desirable but ideally it would not hurt the case where there are > no watchers at all. > Of course. And thanks for catching this regression even before I posted the patches :-) > So what's the right way forward? The patch as-is even though the fsnotify() > change itself may be marginal, a patch that just inlines the fast path > of fsnotify_parent or wait for the additional functionality and try and > address the overhead on top? > > Let me add your optimizations on top of my branch with the needed adaptations and send you a branch for testing. Thanks, Amir.