Re: [PATCH v3 0/3] fsnotify: fix softlockups iterating over d_subdirs

On Tue 01-11-22 13:48:54, Stephen Brennan wrote:
> Jan Kara <jack@xxxxxxx> writes:
> > Hi Stephen!
> >
> > On Thu 27-10-22 17:10:13, Stephen Brennan wrote:
> >> Here is v3 of the patch series. I've taken all of the feedback,
> >> thanks Amir, Christian, Hilf, et al. Differences are noted in each
> >> patch.
> >> 
> >> I caught an obvious and silly dentry reference leak: d_find_any_alias()
> >> returns a reference, which I never called dput() on. With that change, I
> >> no longer see the rpc_pipefs issue, but I do think I need more testing
> >> and thinking through the third patch. Al, I'd love your feedback on that
> >> one especially.
> >> 
> >> Thanks,
> >> Stephen
> >> 
> >> Stephen Brennan (3):
> >>   fsnotify: Use d_find_any_alias to get dentry associated with inode
> >>   fsnotify: Protect i_fsnotify_mask and child flags with inode rwsem
> >>   fsnotify: allow sleepable child flag update
> >
> > Thanks for the patches Stephen and I'm sorry for replying somewhat late.
> 
> Absolutely no worries, these things take time. Thanks for taking a look!
> 
> > The first patch is a no-brainer. The other two patches ... make things
> > somewhat more complicated than I'd like. I guess I can live with them if we
> > don't find a better solution but I'd like to discuss a bit more about
> > alternatives.
> 
> Understood!
> 
> > So what would happen if we just clear DCACHE_FSNOTIFY_PARENT_WATCHED in
> > __fsnotify_parent() for the dentry which triggered the event and does not
> > have watched parent anymore and never bother with full children walk? I
> > suppose your contention problems will be gone, we'll just pay the price of
> > dget_parent() + fsnotify_inode_watches_children() for each child that
> > falsely triggers instead of for only one. Maybe that's not too bad? After
> > all any event up to this moment triggered this overhead as well...
> 
> This is an interesting idea. It came across my mind but I don't think I
> considered it seriously because I assumed that it was too big a change.
> But I suppose in the process I created an even bigger change :P
> 
> The false positive dget_parent() + fsnotify_inode_watches_children()
> shouldn't be too bad. I could see a situation where there's a lot of
> random accesses within a directory, where the dget_parent() could cause
> some contention over the parent dentry. But to be fair, the performance
> would have been the same or worse while fsnotify was active in that
> case, and the contention would go away as most of the dentries get their
> flags cleared. So I don't think this is a problem.
> 
> > Am I missing something?
> 
> I think there's one thing missed here. I understand you'd like to get
> rid of the extra flag in the connector. But the advantage of the flag is
> avoiding duplicate work by saving a bit of state. Suppose that a mark is
> added to a connector, which causes fsnotify_inode_watches_children() to
> become true. Then, any subsequent call to fsnotify_recalc_mask() must
> call __fsnotify_update_child_dentry_flags(), even though the child
> dentry flags don't need to be updated: they're already set. For (very)
> large directories, this can take a few seconds, which means that we're
> doing a few extra seconds of work each time a new mark is added to or
> removed from a connector in that case. I can't imagine that's a super
> common workload though, and I don't know if my customers do that (my
> guess would be no).

I understand. This basically matters for fsnotify_recalc_mask(). As a side
note I've realized that your changes to fsnotify_recalc_mask() acquiring
inode->i_rwsem for updating dentry flags in patch 2/3 are problematic for
dnotify because that calls fsnotify_recalc_mask() under a spinlock.
Furthermore, it is also somewhat worrying for inotify & fanotify because it
nests inode->i_rwsem inside fsnotify_group->lock, and I'm not 100% sure
that nothing forces the ordering the other way around (e.g. the removal
of a oneshot mark during modify event generation). Did you run tests with
lockdep enabled?

Anyway, if the lock ordering issues can be solved, I suppose we can
optimize fsnotify_recalc_mask() like:

	inode_lock(inode);
	spin_lock(&conn->lock);
	oldmask = inode->i_fsnotify_mask;
	__fsnotify_recalc_mask(conn);
	newmask = inode->i_fsnotify_mask;
	spin_unlock(&conn->lock);
	if (watching_children_changed(oldmask, newmask)) /* hypothetical helper */
		__fsnotify_update_child_dentry_flags(...)
	inode_unlock(inode);

And because everything is serialized by inode_lock, we don't have to worry
about inode->i_fsnotify_mask and dentry flags getting out of sync or some
mark addition returning before all children are marked for reporting
events. No need for the connector flag AFAICT.
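
To make the intent of the sketch above concrete, here is a small user-space
model of it. This is only an illustration under stated assumptions, not
kernel code: struct model_inode, the model_* functions, watches_children()
and MODEL_EVENT_ON_CHILD are all made-up stand-ins, both locks are plain
pthread mutexes (in the kernel conn->lock is a spinlock and the outer lock is
inode->i_rwsem), and the expensive child walk is reduced to a counter.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the "report events on children" bit. */
#define MODEL_EVENT_ON_CHILD 0x1u
#define MODEL_MAX_MARKS 4

struct model_inode {
	pthread_mutex_t lock;      /* models inode_lock(), i.e. inode->i_rwsem */
	pthread_mutex_t conn_lock; /* models conn->lock */
	uint32_t mask;             /* models inode->i_fsnotify_mask */
	uint32_t mark_masks[MODEL_MAX_MARKS]; /* masks of attached marks */
	int nr_marks;
	int child_walks;           /* how many times the expensive walk ran */
};

/* Simplified stand-in for "does this mask watch children?". */
static bool watches_children(uint32_t mask)
{
	return (mask & MODEL_EVENT_ON_CHILD) != 0;
}

/* Models __fsnotify_recalc_mask(): OR all mark masks. Needs conn_lock. */
static void model_recalc_locked(struct model_inode *inode)
{
	uint32_t new_mask = 0;

	for (int i = 0; i < inode->nr_marks; i++)
		new_mask |= inode->mark_masks[i];
	inode->mask = new_mask;
}

/* Models the child dentry flag update; here it only counts invocations. */
static void model_update_children(struct model_inode *inode)
{
	inode->child_walks++;
}

static void model_recalc_mask(struct model_inode *inode)
{
	uint32_t oldmask, newmask;

	pthread_mutex_lock(&inode->lock);
	pthread_mutex_lock(&inode->conn_lock);
	oldmask = inode->mask;
	model_recalc_locked(inode);
	newmask = inode->mask;
	pthread_mutex_unlock(&inode->conn_lock);
	/* Walk the children only when the "watches children" state flips. */
	if (watches_children(oldmask) != watches_children(newmask))
		model_update_children(inode);
	pthread_mutex_unlock(&inode->lock);
}

/* Attach a mark, then recalculate the mask as a mark addition would. */
static void model_add_mark(struct model_inode *inode, uint32_t mask)
{
	pthread_mutex_lock(&inode->conn_lock);
	assert(inode->nr_marks < MODEL_MAX_MARKS);
	inode->mark_masks[inode->nr_marks++] = mask;
	pthread_mutex_unlock(&inode->conn_lock);
	model_recalc_mask(inode);
}
```

In this model, adding a second mark that also watches children recalculates
the mask but does not repeat the walk, because the old and new masks agree on
whether children are watched. The outer lock is held across both the
comparison and the walk, which is what keeps the mask and the child state
from drifting apart.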

But the locking issue needs to be resolved first in any case. I need to
think some more...

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR


