Hi Jan, Jan Kara <jack@xxxxxxx> writes: > Hi Stephen! > > On Thu 27-10-22 17:10:13, Stephen Brennan wrote: >> Here is v3 of the patch series. I've taken all of the feedback, >> thanks Amir, Christian, Hilf, et al. Differences are noted in each >> patch. >> >> I caught an obvious and silly dentry reference leak: d_find_any_alias() >> returns a reference, which I never called dput() on. With that change, I >> no longer see the rpc_pipefs issue, but I do think I need more testing >> and thinking through the third patch. Al, I'd love your feedback on that >> one especially. >> >> Thanks, >> Stephen >> >> Stephen Brennan (3): >> fsnotify: Use d_find_any_alias to get dentry associated with inode >> fsnotify: Protect i_fsnotify_mask and child flags with inode rwsem >> fsnotify: allow sleepable child flag update > > Thanks for the patches Stephen and I'm sorry for replying somewhat late. Absolutely no worries, these things take time. Thanks for taking a look! > The first patch is a nobrainer. The other two patches ... complicate things > somewhat more complicated than I'd like. I guess I can live with them if we > don't find a better solution but I'd like to discuss a bit more about > alternatives. Understood! > So what would happen if we just clear DCACHE_FSNOTIFY_PARENT_WATCHED in > __fsnotify_parent() for the dentry which triggered the event and does not > have watched parent anymore and never bother with full children walk? I > suppose your contention problems will be gone, we'll just pay the price of > dget_parent() + fsnotify_inode_watches_children() for each child that > falsely triggers instead of for only one. Maybe that's not too bad? After > all any event upto this moment triggered this overhead as well... This is an interesting idea. It came across my mind but I don't think I considered it seriously because I assumed that it was too big a change. But I suppose in the process I created an even bigger change :P The false positive dget_parent() + fsnotify_inode_watches_children() shouldn't be too bad. I could see a situation where there's a lot of random accesses within a directory, where the dget_parent() could cause some contention over the parent dentry. But to be fair, the performance would have been the same or worse while fsnotify was active in that case, and the contention would go away as most of the dentries get their flags cleared. So I don't think this is a problem. > Am I missing something? I think there's one thing missed here. I understand you'd like to get rid of the extra flag in the connector. But the advantage of the flag is avoiding duplicate work by saving a bit of state. Suppose that a mark is added to a connector, which causes fsnotify_inode_watches_children() to become true. Then, any subsequent call to fsnotify_recalc_mask() must call __fsnotify_update_child_dentry_flags(), even though the child dentry flags don't need to be updated: they're already set. For (very) large directories, this can take a few seconds, which means that we're doing a few extra seconds of work each time a new mark is added to or removed from a connector in that case. I can't imagine that's a super common workload though, and I don't know if my customers do that (my guess would be no). For an example of a test workload where this would make a difference: one of my test cases is to create a directory with millions of negative dentries, and then run "for i in {1..20}; do inotifywait $DIR & done". With the series as-is, only the first task needs to do the child flag update. With your proposed alternative, each task would re-do the update. If we were to manage to get __fsnotify_update_child_dentry_flags() to work correctly, and safely, with the ability to drop the d_lock and sleep, then I think that wouldn't be too much of a problem, because then other spinlock users of the directory will get a chance to grab it, and so we don't risk softlockups. Without the sleepable iterations, it would be marginally worse than the current solution, but I can't really comment _how_ much worse because like I said, it doesn't sound like a frequent usage pattern. I think I have a _slight_ preference for the current design, but I see the appeal of simpler code, and your design would still improve things a lot! Maybe Amir has an opinion too, but of course I'll defer to what you want. Thanks, Stephen