[PATCH 0/5] VFS: Directory level cache cleaning

Li Wang <liwang@xxxxxxxxxxxxxxx> · Mon, 16 Dec 2013 07:00:04 -0800

Currently, Linux only support file system wide VFS
cache (dentry cache and page cache) cleaning through
'/proc/sys/vm/drop_caches'. Sometimes this is less
flexible. The applications may know exactly whether
the metadata and data will be referenced or not in future,
a desirable mechanism is to enable applications to
reclaim the memory of unused cache entries at a finer
granularity - directory level. This enables applications
to keep hot metadata and data (to be referenced in the
future) in the cache, and kick unused out to avoid
cache thrashing. Another advantage is it is more flexible
for debugging.

This patch extend the 'drop_caches' interface to
support directory level cache cleaning and has a complete
backward compatibility. '{1,2,3}' keeps the same semantics
as before. Besides, "{1,2,3}:DIRECTORY_PATH_NAME" is allowed
to recursively clean the caches under DIRECTORY_PATH_NAME.
For example, 'echo 1:/home/foo/jpg > /proc/sys/vm/drop_caches'
will clean the page caches of the files inside 'home/foo/jpg'.

It is easy to demonstrate the advantage of directory level
cache cleaning. We use a virtual machine configured with
an Intel(R) Xeon(R) 8-core CPU E5506 @ 2.13GHz, and with 1GB
memory.  Three directories named '1', '2' and '3' are created,
with each containing 180000 – 280000 files. The test program
opens all files in a directory and then tries the next directory.
The order for accessing the directories is '1', '2', '3',
'1'.

The time on accessing '1' on the second time is measured
with/without cache cleaning, under different file counts.
With cache cleaning, we clean all cache entries of files
in '2' before accessing the files in '3'. The results
are as follows (in seconds),

Note: by default, VFS will move those unreferenced inodes
into a global LRU list rather than freeing them, for this
experiment, we modified iput() to force to free inode as well,
this behavior and related codes are left for further discussion,
thus not reflected in this patch)

Number of files:   180000 200000 220000 240000 260000
Without cleaning:  2.165  6.977  10.032 11.571 13.443
With cleaning:     1.949  1.906  2.336  2.918  3.651

When the number of files is 180000 in each directory,
the metadata cache is large enough to buffer all entries
of three directories, so re-accessing '1' will hit in
the cache, regardless of whether '2' cleaned up or not.
As the number of files increases, the cache can now only
buffer two+ directories. Accessing '3' will result in some
entries of '1' to be evicted (due to LRU). When re-accessing '1',
some entries need be reloaded from disk, which is time-consuming.
In this case, cleaning '2' before accessing '3' enjoys a good
speedup, a maximum 4.29X performance improvements is achieved.
The advantage of directory level page cache cleaning should be 
easier to be demonstrated.

Signed-off-by: Li Wang <liwang@xxxxxxxxxxxxxxx>
Signed-off-by: Yunchuan Wen <yunchuanwen@xxxxxxxxxxxxxxx>

Li Wang (5):
  VFS: Convert drop_caches to accept string
  VFS: Convert sysctl_drop_caches to string
  VFS: Add the declaration of shrink_pagecache_parent
  VFS: Add shrink_pagecache_parent
  VFS: Extend drop_caches sysctl handler to allow directory level cache
    cleaning

 fs/dcache.c            |   35 +++++++++++++++++++++++++++++++++++
 fs/drop_caches.c       |   45 +++++++++++++++++++++++++++++++++++++--------
 include/linux/dcache.h |    1 +
 include/linux/mm.h     |    3 ++-
 kernel/sysctl.c        |    6 ++----
 5 files changed, 77 insertions(+), 13 deletions(-)

-- 
1.7.9.5

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>