v3->v4:
 - Remove the config option "neg_dentry_pc" to reduce the burden on
   system administrators.
 - Allow referenced negative dentries to be recycled in the list
   instead of being killed in pruning.
 - Enable auto-tuneup of the negative dentry limit to match the
   positive dentry count.
 - Remove the umount racing patch, but take an active reference on the
   SB while pruning to prevent it from vanishing.
 - Separate out the dentry_kill() code relocation in patch 1 into a
   separate patch.
 - Move the negative dentry tracking patch in front of the limiting
   patch.
 - Decrease the default negative dentry percentage from 5% to 2%.

v2->v3:
 - Add a faster pruning rate when the free pool is close to depletion.
 - As suggested by James Bottomley, add an artificial delay waiting
   loop before killing a negative dentry and properly clear the
   DCACHE_KILL_NEGATIVE flag if killing doesn't happen.
 - Add a new patch to track the number of negative dentries that are
   forcibly killed.

v1->v2:
 - Move the new nr_negative field to the end of the dentry_stat_t
   structure as suggested by Matthew Wilcox.
 - With the help of Miklos Szeredi, fix incorrect locking order in
   dentry_kill() by using lock_parent() instead of locking the
   parent's d_lock directly.
 - Correctly account for positive to negative dentry transitions.
 - Automatic pruning of negative dentries will now ignore the
   reference bit in negative dentries, but regular shrinking will not.

A rogue application can potentially create a large number of negative
dentries in the system, consuming most of the available memory even if
the memory controller is enabled to limit memory usage. This can impact
the performance of other applications running on the system.

We have customers seeing soft lockups and unresponsive systems when
tearing down a container because of the large number of negative
dentries that accumulated during its uptime and had to be cleaned up
at exit time, when the container's filesystem was unmounted. So we need
to do something about it.

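As an illustration only, a program of the following shape is enough to
populate the dcache with negative dentries: it looks up names that are
assumed not to exist. This is a minimal sketch, not the test program
used for the numbers in this cover letter; the helper
probe_missing_names() and the "negdentry-demo-" name prefix are
hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>

/*
 * Look up `count` file names that are assumed not to exist under `dir`.
 * On Linux, each lookup that fails with ENOENT may leave a negative
 * dentry behind in the dcache. Returns the number of lookups that
 * failed with ENOENT.
 *
 * Hypothetical helper: the function name and the file name prefix are
 * made up for this example.
 */
static unsigned long probe_missing_names(const char *dir, unsigned long count)
{
	char path[4096];
	struct stat st;
	unsigned long missing = 0;

	assert(dir != NULL);

	for (unsigned long i = 0; i < count; i++) {
		int len = snprintf(path, sizeof(path),
				   "%s/negdentry-demo-%lu", dir, i);

		/* Stop if the path could not be formatted without truncation. */
		if (len < 0 || (size_t)len >= sizeof(path))
			break;

		/* A failed lookup of a missing name is what we count. */
		if (stat(path, &st) == -1 && errno == ENOENT)
			missing++;
	}
	return missing;
}
```

Calling probe_missing_names("/tmp", 1000000) from a small main() would
issue one million such lookups. How many of the resulting negative
dentries stay cached depends on the kernel and on memory pressure.
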
This patchset introduces changes to the dcache subsystem to limit the
number of negative dentries that can be created, thus limiting the
amount of memory that negative dentries can consume.

Patch 1 just relocates the position of the dentry_kill() function.

Patch 2 tracks the number of negative dentries present in the LRU
lists and reports it in /proc/sys/fs/dentry-state.

Patch 3 sets a limit on the number of negative dentries allowed as a
small percentage (2%) of total system memory. So the larger the system,
the more negative dentries are allowed. Once the limit is reached, new
negative dentries will be killed after use.

Patch 4 enables automatic pruning of the least recently used negative
dentries when the count is close to the limit, so that we won't end up
killing recently used negative dentries.

Patch 5 shows the number of forced negative dentry killings in
/proc/sys/fs/dentry-state.

Patch 6 enables auto-tuneup of the free pool negative dentry count to
no more than the maximum number of positive dentries ever used.

With a 4.13 based kernel, the positive and negative dentry lookup rates
(lookups per second) after initial boot on a 36-core, 50GB memory VM
with and without the patch were as follows:

  Metric                    w/o patch  with patch
  ------                    ---------  ----------
  Positive dentry lookup       840269      845762
  Negative dentry lookup      1903405     1962514
  Negative dentry creation    6817957     6928768

The last row refers to the creation rate of 1 million negative
dentries.

With 50GB of memory, 1 million negative dentries can be created with
the patched kernel without any pruning or dentry killing.

Ignoring some inherent noise in the test results, there wasn't any
noticeable difference in lookup or negative dentry creation performance
with or without this patch.

When 10 million negative dentries were created, however, the
performance differed:

  Metric                    w/o patch  with patch
  ------                    ---------  ----------
  Negative dentry creation     651663      190105

For the patched kernel, the corresponding dentry-state was:

  1608833 1590416 45      0       1579878 8286952

This was expected, as negative dentry creation was throttled by forced
dentry deletion in this case.

Running the AIM7 high-systime workload on the same VM, the baseline
performance was 186770 jobs/min. With a single-threaded rogue negative
dentry creation program running in the background until the patched
kernel with the 2% limit started throttling, the performance was 183746
jobs/min. On an unpatched kernel with memory almost exhausted and the
memory shrinker kicked in, the performance was 148997 jobs/min.

So the patch does protect the system from significant performance
degradation when a rogue negative dentry creation program is running in
the background.

Waiman Long (6):
  fs/dcache: Relocate dentry_kill() after lock_parent()
  fs/dcache: Track & report number of negative dentries
  fs/dcache: Limit numbers of negative dentries
  fs/dcache: Enable automatic pruning of negative dentries
  fs/dcache: Track count of negative dentries forcibly killed
  fs/dcache: Autotuning of negative dentry limit

 fs/dcache.c              | 462 +++++++++++++++++++++++++++++++++++++++++++----
 include/linux/dcache.h   |   8 +-
 include/linux/list_lru.h |   1 +
 mm/list_lru.c            |   4 +-
 4 files changed, 439 insertions(+), 36 deletions(-)

-- 
1.8.3.1