On Fri 04-11-22 16:33:18, Stephen Brennan wrote: > Jan Kara <jack@xxxxxxx> writes: > > On Tue 01-11-22 13:48:54, Stephen Brennan wrote: > >> Jan Kara <jack@xxxxxxx> writes: > >> > Hi Stephen! > >> > > >> > On Thu 27-10-22 17:10:13, Stephen Brennan wrote: > >> >> Here is v3 of the patch series. I've taken all of the feedback, > >> >> thanks Amir, Christian, Hilf, et al. Differences are noted in each > >> >> patch. > >> >> > >> >> I caught an obvious and silly dentry reference leak: d_find_any_alias() > >> >> returns a reference, which I never called dput() on. With that change, I > >> >> no longer see the rpc_pipefs issue, but I do think I need more testing > >> >> and thinking through the third patch. Al, I'd love your feedback on that > >> >> one especially. > >> >> > >> >> Thanks, > >> >> Stephen > >> >> > >> >> Stephen Brennan (3): > >> >> fsnotify: Use d_find_any_alias to get dentry associated with inode > >> >> fsnotify: Protect i_fsnotify_mask and child flags with inode rwsem > >> >> fsnotify: allow sleepable child flag update > >> > > >> > Thanks for the patches Stephen and I'm sorry for replying somewhat late. > >> > >> Absolutely no worries, these things take time. Thanks for taking a look! > >> > >> > The first patch is a nobrainer. The other two patches ... complicate things > >> > somewhat more complicated than I'd like. I guess I can live with them if we > >> > don't find a better solution but I'd like to discuss a bit more about > >> > alternatives. > >> > >> Understood! > >> > >> > So what would happen if we just clear DCACHE_FSNOTIFY_PARENT_WATCHED in > >> > __fsnotify_parent() for the dentry which triggered the event and does not > >> > have watched parent anymore and never bother with full children walk? I > >> > suppose your contention problems will be gone, we'll just pay the price of > >> > dget_parent() + fsnotify_inode_watches_children() for each child that > >> > falsely triggers instead of for only one. Maybe that's not too bad? After > >> > all any event upto this moment triggered this overhead as well... > >> > >> This is an interesting idea. It came across my mind but I don't think I > >> considered it seriously because I assumed that it was too big a change. > >> But I suppose in the process I created an even bigger change :P > >> > >> The false positive dget_parent() + fsnotify_inode_watches_children() > >> shouldn't be too bad. I could see a situation where there's a lot of > >> random accesses within a directory, where the dget_parent() could cause > >> some contention over the parent dentry. But to be fair, the performance > >> would have been the same or worse while fsnotify was active in that > >> case, and the contention would go away as most of the dentries get their > >> flags cleared. So I don't think this is a problem. > >> > >> > Am I missing something? > >> > >> I think there's one thing missed here. I understand you'd like to get > >> rid of the extra flag in the connector. But the advantage of the flag is > >> avoiding duplicate work by saving a bit of state. Suppose that a mark is > >> added to a connector, which causes fsnotify_inode_watches_children() to > >> become true. Then, any subsequent call to fsnotify_recalc_mask() must > >> call __fsnotify_update_child_dentry_flags(), even though the child > >> dentry flags don't need to be updated: they're already set. For (very) > >> large directories, this can take a few seconds, which means that we're > >> doing a few extra seconds of work each time a new mark is added to or > >> removed from a connector in that case. I can't imagine that's a super > >> common workload though, and I don't know if my customers do that (my > >> guess would be no). > > > > I understand. This basically matters for fsnotify_recalc_mask(). As a side > > note I've realized that your changes to fsnotify_recalc_mask() acquiring > > inode->i_rwsem for updating dentry flags in patch 2/3 are problematic for > > dnotify because that calls fsnotify_recalc_mask() under a spinlock. > > Furthermore it is somewhat worrying also for inotify & fanotify because it > > nests inode->i_rwsem inside fsnotify_group->lock however I'm not 100% sure > > something doesn't force the ordering the other way around (e.g. the removal > > of oneshot mark during modify event generation). Did you run tests with > > lockdep enabled? > > No I didn't. I'll be sure to get the LTP tests running with lockdep > early next week and try this series out, we'll probably get an error > like you say. > > I'll also take a look at the dnotify use case and see if there's > anything to do there. Hopefully there's something we can do to salvage > it :D Yeah, dnotify should be solvable. I'm even willing to accept somewhat unusual methods for dnotify ;). It is ancient and rarely used API these days so I don't want it to block fixes for newer APIs. From a quick look it should be enough to move fsnotify_recalc_mask() call in dnotify_recalc_inode_mask() from under the mark->lock out into callers and perhaps rename dnotify_recalc_inode_mask() to dnotify_recalc_mark_mask() to make the name somewhat less confusing. Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR