Re: [LSF/MM TOPIC] Better handling of negative dentries

Matthew Wilcox <willy@xxxxxxxxxxxxx> · Tue, 22 Mar 2022 15:08:33 +0000

On Wed, Mar 16, 2022 at 01:52:23PM +1100, Dave Chinner wrote:
> On Wed, Mar 16, 2022 at 10:07:19AM +0800, Gao Xiang wrote:
> > On Tue, Mar 15, 2022 at 01:56:18PM -0700, Roman Gushchin wrote:
> > > 
> > > > On Mar 15, 2022, at 12:56 PM, Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> > > > 
> > > > The number of negative dentries is effectively constrained only by memory
> > > > size.  Systems which do not experience significant memory pressure for
> > > > an extended period can build up millions of negative dentries which
> > > > clog the dcache.  That can have different symptoms, such as inotify
> > > > taking a long time [1], high memory usage [2] and even just poor lookup
> > > > performance [3].  We've also seen problems with cgroups being pinned
> > > > by negative dentries, though I think we now reparent those dentries to
> > > > their parent cgroup instead.
> > > 
> > > Yes, it should be fixed already.
> > > 
> > > > 
> > > > We don't have a really good solution yet, and maybe some focused
> > > > brainstorming on the problem would lead to something that actually works.
> > > 
> > > I’d be happy to join this discussion. And in my opinion it’s going
> > > beyond negative dentries: there are other types of objects which tend
> > > to grow beyond any reasonable limits if there is no memory pressure.
> > 
> > +1, we once had a similar issue as well, and agree that is not only
> > limited to negative dentries but all too many LRU-ed dentries and inodes.
> 
> Yup, any discussion solely about managing buildup of negative
> dentries doesn't acknowledge that it is just a symptom of larger
> problems that need to be addressed.

Yes, but let's not make this _so_ broad a discussion that it becomes
unsolvable.  Rather, let's look for a solution to this particular problem
that can be adopted by other caches that share a similar problem.

For example, we might be seduced into saying "this is a slab problem"
because all the instances we have here allocate from slab.  But slab
doesn't have enough information to solve the problem.  Maybe the working
set of the current workload really needs 6 million dentries to perform
optimally.  Maybe it needs 600.  Slab can't know that.  Maybe slab can
play a role here, but the only component which can know the appropriate
size for a cache is the cache itself.

I think the logic needs to be in d_alloc().  Before it calls __d_alloc(),
it should check ... something ... to see if it should try to shrink
the LRU list.  The devil is in what that something should be.  I'm no
expert on the dcache; do we just want to call prune_dcache_sb() once in
every 1000 calls?  Rely on DCACHE_REFERENCED to make sure that we're
not over-pruning the list?  If so, what do we set nr_to_scan to?  1000 so
that we try to keep the dentry list the same size?  1500 so that it
actually tries to shrink?
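
To make the question concrete, here is a user-space toy model of that
idea.  It is a sketch only, not kernel code: every toy_* name, the
PRUNE_INTERVAL and NR_TO_SCAN constants, and the list handling are made
up for illustration, and the "referenced" bit is only a stand-in for
DCACHE_REFERENCED under the assumption that a referenced entry gets one
more trip around the LRU before it can be freed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical tunables, using the numbers floated above. */
#define PRUNE_INTERVAL 1000UL	/* prune once per this many allocations */
#define NR_TO_SCAN     1500UL	/* entries examined per prune pass */

struct toy_dentry {
	struct toy_dentry *prev, *next;	/* LRU links */
	bool referenced;		/* stand-in for DCACHE_REFERENCED */
};

struct toy_dcache {
	struct toy_dentry lru;		/* sentinel; lru.next is the newest entry */
	unsigned long nr_entries;
	unsigned long nr_allocs;
};

void toy_dcache_init(struct toy_dcache *c)
{
	c->lru.prev = c->lru.next = &c->lru;
	c->lru.referenced = false;
	c->nr_entries = 0;
	c->nr_allocs = 0;
}

void toy_lru_del(struct toy_dentry *d)
{
	d->prev->next = d->next;
	d->next->prev = d->prev;
}

void toy_lru_add_head(struct toy_dcache *c, struct toy_dentry *d)
{
	d->next = c->lru.next;
	d->prev = &c->lru;
	c->lru.next->prev = d;
	c->lru.next = d;
}

/*
 * Examine up to nr_to_scan entries, oldest first.  A referenced entry
 * loses its bit and moves back to the head; an unreferenced one is
 * freed.  Returns the number of entries freed.
 */
unsigned long toy_prune(struct toy_dcache *c, unsigned long nr_to_scan)
{
	unsigned long freed = 0;

	while (nr_to_scan-- > 0 && c->lru.prev != &c->lru) {
		struct toy_dentry *d = c->lru.prev;

		toy_lru_del(d);
		if (d->referenced) {
			d->referenced = false;
			toy_lru_add_head(c, d);
		} else {
			free(d);
			c->nr_entries--;
			freed++;
		}
	}
	return freed;
}

/* Mark an entry as recently used. */
void toy_touch(struct toy_dentry *d)
{
	d->referenced = true;
}

/* Allocate an entry; every PRUNE_INTERVAL-th call prunes first. */
struct toy_dentry *toy_d_alloc(struct toy_dcache *c)
{
	struct toy_dentry *d;

	if (++c->nr_allocs % PRUNE_INTERVAL == 0)
		toy_prune(c, NR_TO_SCAN);

	d = malloc(sizeof(*d));
	if (!d)
		return NULL;
	d->referenced = false;
	toy_lru_add_head(c, d);
	c->nr_entries++;
	return d;
}
```

In this model, a workload that never touches its entries cannot hold
more than PRUNE_INTERVAL of them, since NR_TO_SCAN is the larger number.
Entries that keep getting touched survive each pass.  Whether either
number is sensible for the real dcache is exactly the open question.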

I don't feel like I know enough to go further here.  But it feels better
than what we currently do -- calling all the shrinkers from deep in
the page allocator.