On Tue, Feb 15, 2022 at 06:24:53PM -0800, Stephen Brennan wrote:

> It seems to me that, if we had taken a reference on child by
> incrementing the reference count prior to unlocking it, then
> dentry_unlist could never have been called, since we would never have
> made it into __dentry_kill. child would still be on the list, and any
> cursor (or sweep_negative) list updates would now be reflected in
> child->d_child.next. But dput is definitely not safe while holding a
> lock on a parent dentry (even more so now thanks to my patch), so that
> is out of the question.
>
> Would dput_to_list be an appropriate solution to that issue? We can
> maintain a dispose list in d_walk and then for any dput which really
> drops the refcount to 0, we can handle them after d_walk is done. It
> shouldn't be that many dentries anyway.

Interesting idea, but... what happens to behaviour of e.g.
shrink_dcache_parent()? You'd obviously need to modify the test in
select_collect(), but then the selected dentries become likely candidates
for d_walk() itself wanting to move them over to its internal shrink list.
OTOH, __dput_to_list() will just decrement the count and skip the sucker
if it's already on a shrink list...

It might work, but it really needs a careful analysis wrt. parallel
d_walk(). What happens when you have two threads hitting
shrink_dcache_parent() on two different places, one being an ancestor of
another? That can happen in parallel, and currently it does work correctly,
but that's fairly delicate and there are places where a minor change could
turn O(n) into O(n^2), etc.

Let me think about that - I'm not saying it's hopeless, and it would be
nice to avoid that subtlety in dentry_unlist(), but there might be dragons.
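
For illustration only, the deferred-dispose idea discussed above can be
modelled in userspace C. This is a toy sketch under stated assumptions, not
the kernel code: struct node, struct dispose_list, put_to_list() and drain()
are hypothetical stand-ins for a dentry, the proposed per-walk dispose list,
the __dput_to_list() behaviour described above, and the cleanup pass after
the walk. Locking, the LRU and the parallel-walker interaction (the part
that needs the careful analysis) are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a dentry; every name here is hypothetical. */
struct node {
	int refcount;              /* models the dentry reference count */
	bool on_shrink_list;       /* models "already on some shrink list" */
	struct node *next_dispose; /* link used only by the dispose list */
};

/* Hypothetical per-walk dispose list. */
struct dispose_list {
	struct node *head;
	size_t len;
};

/*
 * Drop one reference without tearing anything down on the spot.
 * If the node already sits on a shrink list, only the count is
 * decremented and the owner of that list is left to deal with it.
 * Otherwise, a node whose count reaches zero is queued on @list.
 * Returns true if the node was queued on @list by this call.
 */
static bool put_to_list(struct node *n, struct dispose_list *list)
{
	assert(n->refcount > 0);
	n->refcount--;

	if (n->on_shrink_list)
		return false;
	if (n->refcount != 0)
		return false;

	n->on_shrink_list = true;
	n->next_dispose = list->head;
	list->head = n;
	list->len++;
	return true;
}

/*
 * Meant to run once the walk is finished and no parent lock is held.
 * Real code would kill each victim here; the sketch only unlinks it
 * and reports how many nodes were processed.
 */
static size_t drain(struct dispose_list *list)
{
	size_t processed = 0;

	while (list->head) {
		struct node *victim = list->head;

		list->head = victim->next_dispose;
		victim->next_dispose = NULL;
		victim->on_shrink_list = false;
		list->len--;
		processed++;
	}
	return processed;
}
```

In this model, the final put on a node during the walk only queues it, so
the sibling list being walked is not modified until drain() runs afterwards.
Whether that holds up against two overlapping shrink_dcache_parent() walkers
is exactly the open question above, and the sketch does not answer it.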