Re: [PATCH] fix quadratic behavior of shrink_dcache_parent()

On Fri, 09 Feb 2007 23:01:06 +0100
Miklos Szeredi <miklos@xxxxxxxxxx> wrote:

> From: Miklos Szeredi <mszeredi@xxxxxxx>
> 
> The time shrink_dcache_parent() takes grows quadratically with the
> depth of the tree under 'parent'.  This starts to get noticeable at
> a depth of about 10,000.
> 
> These kinds of depths don't occur normally, and filesystems which
> invoke shrink_dcache_parent() via d_invalidate() seem to have other
> depth-dependent timings, so it's not even easy to expose this problem.
> 
> However with FUSE it's easy to create a deep tree and d_invalidate()
> will also get called.  This can make a syscall hang for a very long
> time.
> 
> This is the original discovery of the problem by Russ Cox:
> 
>   http://article.gmane.org/gmane.comp.file-systems.fuse.devel/3826

"The file system mounted on /tmp/z in the example contains 2^50
directories".   heh.

I do wonder how likely this problem is to show up in real life.

> The following patch fixes the quadratic behavior, by optionally
> allowing prune_dcache() to prune ancestors of a dentry in one go,
> instead of doing it one at a time.
> 
> Common code in dput() and prune_one_dentry() is extracted into a new
> helper function d_kill().
> 
> shrink_dcache_parent() as well as shrink_dcache_sb() are converted to
> use the ancestry-pruner option.  Only for shrink_dcache_memory() is
> this behavior not desirable, so it keeps using the old algorithm.
> 
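
A toy model of the difference described above, not the kernel code.  The
sketch assumes the tree under 'parent' is a single chain of `depth` nodes
and simply counts node visits.  `struct node`, `build_chain()`,
`find_leaf()`, `steps_one_at_a_time()` and `steps_with_ancestors()` are
hypothetical names made up for illustration; they only mimic the shape of
"rescan from the top after every prune" versus "prune the ancestors in one
go".

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Toy stand-in for a dentry: links only, no refcounts or locking. */
struct node {
    struct node *parent;
    struct node *child;   /* one child at most: the tree is a chain */
    int alive;
};

/* Build a chain of `depth` live nodes; element 0 is the root. */
static struct node *build_chain(size_t depth)
{
    struct node *nodes;
    size_t i;

    assert(depth > 0);
    nodes = calloc(depth, sizeof(*nodes));
    if (!nodes) {
        perror("calloc");
        exit(EXIT_FAILURE);
    }
    for (i = 0; i < depth; i++) {
        nodes[i].alive = 1;
        if (i > 0) {
            nodes[i].parent = &nodes[i - 1];
            nodes[i - 1].child = &nodes[i];
        }
    }
    return nodes;
}

/* Walk from the root to the deepest live node, counting each visit. */
static struct node *find_leaf(struct node *root, unsigned long long *steps)
{
    struct node *n = root;

    (*steps)++;
    while (n->child && n->child->alive) {
        n = n->child;
        (*steps)++;
    }
    return n;
}

/* Old shape: every pass rescans from the root and prunes one leaf. */
unsigned long long steps_one_at_a_time(size_t depth)
{
    struct node *root = build_chain(depth);
    unsigned long long steps = 0;

    while (root->alive)
        find_leaf(root, &steps)->alive = 0;
    free(root);
    return steps;
}

/* New shape: find the leaf once, then prune it and all its ancestors. */
unsigned long long steps_with_ancestors(size_t depth)
{
    struct node *root = build_chain(depth);
    unsigned long long steps = 0;
    struct node *n;

    for (n = find_leaf(root, &steps); n; n = n->parent) {
        n->alive = 0;
        steps++;
    }
    free(root);
    return steps;
}
```

In this model a chain of depth 10,000 costs 50,005,000 visits one at a
time and 20,000 visits with ancestor pruning, which is the
quadratic-versus-linear gap the patch description is talking about.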

I wonder if we should be setting shrink_parents=1 in
shrink_dcache_memory()?  Because we have this problem where the dentry
slabs suffer lots of internal fragmentation and we end up with whole slab
pages pinned by a single directory dentry.  I expect that if
shrink_dcache_memory() were aggressive about reaping newly-childless
directory dentries, some improvements might be realised there.

If so, we should change prune_dcache() to return the number pruned, so that
shrink_dcache_memory() can keep its arithmetic correct.  Would require some
careful testing and is out of scope for your work.

