Re: [PATCH v6 0/7] fs/dcache: Track & limit # of negative dentries

Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> · Mon, 16 Jul 2018 16:40:32 -0700

On Mon, 16 Jul 2018 05:41:15 -0700 Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:

> On Mon, Jul 16, 2018 at 11:09:01AM +0200, Michal Hocko wrote:
> > On Fri 13-07-18 10:36:14, Dave Chinner wrote:
> > [...]
> > > By limiting the number of negative dentries in this case, internal
> > > slab fragmentation is reduced such that reclaim cost never gets out
> > > of control. While it appears to "fix" the symptoms, it doesn't
> > > address the underlying problem. It is a partial solution at best but
> > > at worst it's another opaque knob that nobody knows how or when to
> > > tune.
> > 
> > Would it help to put all the negative dentries into its own slab cache?
> 
> Maybe the dcache should be more sensitive to its own needs.  In __d_alloc,
> it could check whether there are a high proportion of negative dentries
> and start recycling some existing negative dentries.

Well, yes.

The proposed patchset adds all this background reclaiming.  The problems
are a) background reclaiming sometimes can't keep up, so a synchronous
direct reclaim was added on top, and b) reclaiming dentries in the
background makes tasks which aren't allocating dentries suffer because of
activity from the tasks which are, which is inappropriate.

I expect a better design is something like

__d_alloc()
{
	...
	while (too many dentries)
		call the dcache shrinker
	...
}

and that's it.  This way we have a hard upper limit and only the tasks
which are creating dentries suffer the cost.
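
A minimal userspace sketch of the loop above, not kernel code.  Every
name in it (dcache_model, model_shrink, model_d_alloc, DENTRY_LIMIT) is
hypothetical.  The bail-out when the shrinker makes no progress is an
assumption added so the loop can't spin forever; the pseudocode doesn't
say what should happen in that case.

```c
/* Userspace model only; none of these names are kernel APIs. */
#include <stddef.h>

#define DENTRY_LIMIT 4	/* hypothetical hard cap on cached dentries */

struct dcache_model {
	size_t nr_dentries;	/* dentries currently cached */
	size_t nr_negative;	/* subset which is negative (reclaimable here) */
};

/*
 * Stand-in for "call the dcache shrinker": frees up to nr negative
 * dentries and returns how many were actually freed.
 */
static size_t model_shrink(struct dcache_model *dc, size_t nr)
{
	size_t freed = nr < dc->nr_negative ? nr : dc->nr_negative;

	dc->nr_negative -= freed;
	dc->nr_dentries -= freed;
	return freed;
}

/*
 * Stand-in for __d_alloc(): the allocating task reclaims before it
 * allocates.  Returns 0 on success, -1 if the cache is at the limit
 * and nothing could be reclaimed (assumed policy, see above).
 */
static int model_d_alloc(struct dcache_model *dc, int negative)
{
	while (dc->nr_dentries >= DENTRY_LIMIT) {
		if (model_shrink(dc, 1) == 0)
			return -1;	/* no progress: bail out rather than spin */
	}

	dc->nr_dentries++;
	if (negative)
		dc->nr_negative++;
	return 0;
}
```

In this model a task which creates ten negative dentries in a row leaves
the cache sitting at DENTRY_LIMIT, and that task did all of the
shrinking itself.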

Regarding the slab page fragmentation issue: I'm wondering if the whole
idea of balancing the slab scan rates against the page scan rates isn't
really working out.  Maybe shrink_slab() should be sitting there
hammering the caches until they have freed up a particular number of
pages.  Quite a big change, conceptually and implementationally.
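
To illustrate the difference between counting freed objects and counting
freed pages, here is a toy userspace model, not kernel code.
slab_model, free_one_object, shrink_until_pages_freed and the round-robin
scan order are all made up for illustration, and this does not claim to
reflect how shrink_slab() or any real shrinker walks its objects.

```c
/* Userspace model only; none of these names are kernel APIs. */
#include <stddef.h>

#define OBJS_PER_PAGE 4	/* hypothetical objects per slab page */
#define NR_PAGES 8	/* hypothetical number of pages in the cache */

struct slab_model {
	unsigned int in_use[NR_PAGES];	/* live objects per page, 0 = page free */
};

/* Free one object from page i; returns 1 if that emptied the page. */
static int free_one_object(struct slab_model *s, size_t i)
{
	if (s->in_use[i] == 0)
		return 0;
	return --s->in_use[i] == 0;
}

/*
 * Keep scanning until pages_wanted pages have been freed or the cache
 * is empty.  Returns the number of pages freed and reports the number
 * of objects which had to be freed to get there.
 */
static size_t shrink_until_pages_freed(struct slab_model *s,
				       size_t pages_wanted,
				       size_t *objs_freed)
{
	size_t pages_freed = 0;
	size_t i;
	int progress;

	*objs_freed = 0;
	while (pages_freed < pages_wanted) {
		progress = 0;
		for (i = 0; i < NR_PAGES && pages_freed < pages_wanted; i++) {
			if (s->in_use[i] == 0)
				continue;
			progress = 1;
			(*objs_freed)++;
			pages_freed += free_one_object(s, i);
		}
		if (!progress)
			break;	/* cache is empty, nothing more to free */
	}
	return pages_freed;
}
```

With one live object on each page, freeing two pages costs two objects.
With four live objects on each of the eight pages, the same two pages
cost 26 objects in this model, a cost which an object count alone would
not show.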

Aside: about a billion years ago we were having issues with processes
getting stuck in direct reclaim because other processes were coming in
and stealing away the pages which the direct-reclaimer had just freed.
One possible solution to that was to make direct-reclaiming tasks
release the freed pages into a list on the task_struct.  So those pages
were invisible to other allocating tasks and were available to the
direct-reclaimer when it returned from the reclaim effort.  I forget
what happened to this.

It's quite a small code change and would provide a mechanism for
implementing the hammer-cache-until-you've-freed-enough design above.
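
A rough userspace sketch of the per-task list idea, not kernel code.
page_model, task_model, global_free and the helper functions are all
hypothetical names.  Handing leftover pages back to the global pool once
the reclaim effort is over is an assumption; the description above
doesn't say what happened to pages the reclaimer didn't use.

```c
/* Userspace model only; none of these names are kernel APIs. */
#include <stddef.h>

struct page_model {
	struct page_model *next;
};

struct task_model {
	struct page_model *reclaimed;	/* pages this task freed in direct reclaim */
};

static struct page_model *global_free;	/* pages visible to every task */

static void push(struct page_model **list, struct page_model *p)
{
	p->next = *list;
	*list = p;
}

static struct page_model *pop(struct page_model **list)
{
	struct page_model *p = *list;

	if (p)
		*list = p->next;
	return p;
}

/* Direct reclaim parks the freed page on the reclaiming task. */
static void reclaim_page_for(struct task_model *t, struct page_model *p)
{
	push(&t->reclaimed, p);
}

/* Allocation tries the task's private list first, then the global pool. */
static struct page_model *alloc_page_for(struct task_model *t)
{
	struct page_model *p = pop(&t->reclaimed);

	return p ? p : pop(&global_free);
}

/* Assumed policy: leftovers go back to the global pool afterwards. */
static void release_reclaimed(struct task_model *t)
{
	struct page_model *p;

	while ((p = pop(&t->reclaimed)))
		push(&global_free, p);
}
```

While the pages sit on the reclaimer's list, another task calling
alloc_page_for() gets nothing from them, so it can't steal what the
reclaimer just freed.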

Aside 2: if we *do* do something like the above __d_alloc() pseudocode
then perhaps it could be cast in terms of pages, not dentries.  i.e.,

__d_alloc()
{
	...
	while (too many pages in dentry_cache)
		call the dcache shrinker
	...
}

and, apart from the external name thing (grr), that should address
these fragmentation issues, no?  I assume it's easy to ask slab how
many pages are presently in use for a particular cache.