The VFS relies on an LRU-like page cache eviction algorithm to reclaim
cache space. Such a general and simple algorithm is good in that it is
application independent, and it works well in normal situations. However,
it sometimes does not help much for applications that are performance
sensitive or under heavy load, since LRU may incorrectly evict pages that
are about to be referenced, resulting in severe performance degradation
due to cache thrashing. Applications have the most knowledge about what
they are doing, and they can do better if they are given the chance. This
motivates giving applications more ability to manipulate the page cache.

Currently, Linux supports file-system-wide cache cleaning through the proc
interface 'drop_caches', but it is very coarse grained and was originally
proposed for debugging. The other option is file-level page cache cleaning
through 'fadvise'. However, this is sometimes less flexible and not easy
to use, especially for directory-wide operations or when there are massive
numbers of small files.

This patch extends 'fadvise' to support directory-level page cache
cleaning. A call to posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED) with 'fd'
referring to a directory will recursively reclaim the page cache entries
of the files inside 'fd'. For security reasons, inodes for which the
caller does not have appropriate permissions will not be manipulated.

It is easy to demonstrate the advantages of directory-level page cache
cleaning. We use a machine with a Pentium(R) Dual-Core CPU E5800 @ 3.20GHz
and 2GB of memory. Two directories named '1' and '3' are created, each
containing X (360 - 460) files, and each file is 2MB in size.

The test script (without cache cleaning) is as follows,

    #!/bin/bash
    cp -r 1 2
    sync
    cp -r 3 4
    sync
    time grep "data" 1/*

The time of 'grep "data" 1/*' is measured with and without cache cleaning,
for different file counts.

With cache cleaning, we clean all cache entries of the files in '2' before
doing 'cp -r 3 4', using pretty much the following two statements,

    fd = open("2", O_DIRECTORY, 0644);
    posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);

The results are as follows (in seconds),

X: Number of files inside each directory

    X      Without Cleaning    With Cleaning
    360    2.385               1.361
    380    3.159               1.466
    400    3.972               1.558
    420    4.823               1.548
    440    5.798               1.702
    460    6.888               2.197

The page cache is not large enough to hold all four directories, so
'cp -r 3 4' causes some entries of '1' to be evicted (due to LRU). When
'1' is accessed again, those entries need to be reloaded from disk, which
is time-consuming. In this case, cleaning '2' before 'cp -r 3 4' gives a
good speedup.

Li Wang (3):
  VFS: Add the declaration of shrink_pagecache_parent
  Add shrink_pagecache_parent
  Fadvise: Add the ability for directory level page cache cleaning

 fs/dcache.c            |   36 ++++++++++++++++++++++++++++++++++++
 include/linux/dcache.h |    1 +
 mm/fadvise.c           |    4 ++++
 3 files changed, 41 insertions(+)

-- 
1.7.9.5