On Sun, May 19, 2013 at 11:50:55PM -0400, Keyur Govande wrote:
> Hello,
>
> We have a bunch of servers that create a lot of temp files, or check
> for the existence of non-existent files. Every such operation creates
> a dentry object, and soon most of the free memory is consumed by
> 'negative' dentry entries. This behavior was observed on both the
> CentOS kernel v2.6.32-358 and the Amazon Linux kernel v3.4.43-4.
>
> There are also some processes running that occasionally allocate
> large chunks of memory, and when this happens the kernel clears out
> a bunch of stale dentry caches. This clearing takes some time:
> kswapd kicks in, and an allocation and bzero() of 4GB that normally
> takes <1s takes 20s or more.
>
> Because the memory needs are non-continuous but negative dentry
> generation is fairly continuous, vfs_cache_pressure doesn't help
> much.
>
> The thought I had was to have a sysctl that limits the number of
> dentries per super-block (sb-max-dentry). Every time a new dentry is
> allocated in d_alloc(), check if dentry_stat.nr_dentry exceeds
> (number of super blocks * sb-max-dentry). If yes, queue up an
> asynchronous workqueue call to prune_dcache(). Also have a separate
> sysctl to indicate by what percentage to reduce the dentry entries
> when this happens.

This request does come up every so often. There are valid reasons for
being able to control the exact size of the dentry and page caches -
I've seen a few implementations in storage appliance vendor kernels
where total control of memory usage yields a few percent better
performance on industry-specific benchmarks. Indeed, years ago I
thought that capping the size of the dentry cache was a good idea,
too.

However, the problem I've seen with every single one of these
implementations is that the limit is carefully tuned for best
all-round performance on a given set of canned workloads. When the
limit is wrong, performance tanks, and it is just about impossible to
set a limit correctly for a machine with a changing workload.

If your problem is negative dentries building up, where do you set
the limit? Set it low enough to keep the total number of dentries
small enough to hold the negative dentries down, and you'll end up
with a dentry cache that isn't big enough to hold all the dentries
needed for efficient performance on workloads that do directory
traversals. It's a two-edged sword, and most people do not have
enough knowledge to tune such a knob correctly.

IOWs, the automatic sizing of the dentry cache based on memory
pressure is the correct thing to do. Capping it, or allowing it to be
capped, will simply generate bug reports for strange performance
problems....

That said, keeping lots of negative dentries around until memory
pressure kicks them out is probably the wrong thing to do. Negative
dentries are an optimisation for some workloads, but references to
them tend to have a temporal locality that matches the unlink time.
Perhaps we need to reclaim negative dentries separately, i.e. not
wait for memory pressure to reclaim them but use some other kind of
trigger for reclamation. That doesn't cap the size of the dentry
cache, but it would address the problem of negative dentry
buildup....

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx
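
As a rough illustration of the buildup Keyur describes (a sketch, not
from the original thread): every failed lookup of a name that does not
exist leaves a negative dentry behind, and the growth is visible in
the first field (nr_dentry) of /proc/sys/fs/dentry-state. The path
prefix and iteration count below are arbitrary.

/* Sketch: populate the dcache with negative dentries by stat()ing
 * names that don't exist, then dump /proc/sys/fs/dentry-state,
 * whose first field is nr_dentry. */
#include <stdio.h>
#include <sys/stat.h>

int main(void)
{
	struct stat st;
	char name[64], line[128];
	FILE *f;

	for (int i = 0; i < 1000000; i++) {
		snprintf(name, sizeof(name), "/tmp/no-such-file-%d", i);
		stat(name, &st);	/* fails with ENOENT, leaves a negative dentry */
	}

	f = fopen("/proc/sys/fs/dentry-state", "r");
	if (f) {
		if (fgets(line, sizeof(line), f))
			printf("dentry-state: %s", line);
		fclose(f);
	}
	return 0;
}

Running this and re-reading dentry-state shows nr_dentry climbing by
roughly the number of failed lookups, until memory pressure (or
dropping the caches) trims it back.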
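For concreteness, the check Keyur proposes would sit in d_alloc() and
could look something like the sketch below, written against a
~2.6.32-era fs/dcache.c. dentry_stat.nr_dentry, prune_dcache(),
DECLARE_WORK() and schedule_work() are real interfaces of that
vintage; sysctl_sb_max_dentry, sysctl_sb_prune_percent,
nr_superblocks and the work item are hypothetical names for the knobs
and bookkeeping a real patch would have to add.

/* Hypothetical sketch of the proposed cap, for fs/dcache.c circa
 * 2.6.32. The two sysctls, the nr_superblocks counter and the work
 * item are invented names; a real patch would wire the sysctls into
 * /proc/sys/fs/ and maintain the live superblock count. */
int sysctl_sb_max_dentry = 100000;	/* hypothetical: max dentries per sb */
int sysctl_sb_prune_percent = 10;	/* hypothetical: % to trim when over */
static long nr_superblocks;		/* hypothetical: live superblock count */

static void dentry_limit_workfn(struct work_struct *work)
{
	/* prune_dcache() took a scan count in 2.6.32-era kernels;
	 * trim the requested percentage of the whole cache. */
	prune_dcache(dentry_stat.nr_dentry * sysctl_sb_prune_percent / 100);
}
static DECLARE_WORK(dentry_limit_work, dentry_limit_workfn);

/* Would be called from d_alloc() once the new dentry is accounted,
 * deferring the actual pruning to keep the allocation path fast. */
static void check_dentry_limit(void)
{
	if (dentry_stat.nr_dentry > nr_superblocks * sysctl_sb_max_dentry)
		schedule_work(&dentry_limit_work);
}

As the reply above argues, though, the hard part isn't the mechanism -
it's that no static value for sysctl_sb_max_dentry is right for a
machine whose workload changes.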