Our applications, built on Elasticsearch[0], frequently create and delete files. These applications operate within containers, some with a memory limit exceeding 100GB. Over prolonged periods, the accumulation of negative dentries within these containers can amount to tens of gigabytes. Upon container exit, directories are deleted. However, due to the numerous associated dentries, this process can be time-consuming. Our users have expressed frustration with this prolonged exit duration, which constitutes our first issue. Simultaneously, other processes may attempt to access the parent directory of the Elasticsearch directories. Since the task responsible for deleting the dentries holds the inode lock, processes attempting directory lookup experience significant delays. This issue, our second problem, is easily demonstrated: - Task 1 generates negative dentries: $ pwd ~/test $ mkdir es && cd es/ && ./create_and_delete_files.sh [ After generating tens of GB dentries ] $ cd ~/test && rm -rf es [ It will take a long duration to finish ] - Task 2 attempts to lookup the 'test/' directory $ pwd ~/test $ ls The 'ls' command in Task 2 experiences prolonged execution as Task 1 is deleting the dentries. We've devised a solution to address both issues by deleting associated dentry when removing a file. Interestingly, we've noted that a similar patch was proposed years ago[1], although it was rejected citing the absence of tangible issues caused by negative dentries. Given our current challenges, we're resubmitting the proposal. All relevant stakeholders from previous discussions have been included for reference. [0]. https://github.com/elastic/elasticsearch [1]. https://patchwork.kernel.org/project/linux-fsdevel/patch/1502099673-31620-1-git-send-email-wangkai86@xxxxxxxxxx Signed-off-by: Yafang Shao <laoar.shao@xxxxxxxxx> Cc: Wangkai <wangkai86@xxxxxxxxxx> Cc: Jan Kara <jack@xxxxxxx> Cc: Colin Walters <walters@xxxxxxxxxx> Cc: Waiman Long <longman@xxxxxxxxxx> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> --- fs/dcache.c | 4 ++++ include/linux/dcache.h | 2 +- 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/fs/dcache.c b/fs/dcache.c index 71a8e943a0fa..4b97f60f0e64 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -701,6 +701,9 @@ static inline bool retain_dentry(struct dentry *dentry, bool locked) if (unlikely(d_flags & DCACHE_DONTCACHE)) return false; + if (unlikely(dentry->d_flags & DCACHE_FILE_DELETED)) + return false; + // At this point it looks like we ought to keep it. We also might // need to do something - put it on LRU if it wasn't there already // and mark it referenced if it was on LRU, but not marked yet. @@ -2392,6 +2395,7 @@ void d_delete(struct dentry * dentry) spin_unlock(&dentry->d_lock); spin_unlock(&inode->i_lock); } + dentry->d_flags |= DCACHE_FILE_DELETED; } EXPORT_SYMBOL(d_delete); diff --git a/include/linux/dcache.h b/include/linux/dcache.h index bf53e3894aae..55a69682918c 100644 --- a/include/linux/dcache.h +++ b/include/linux/dcache.h @@ -210,7 +210,7 @@ struct dentry_operations { #define DCACHE_NOKEY_NAME BIT(25) /* Encrypted name encoded without key */ #define DCACHE_OP_REAL BIT(26) - +#define DCACHE_FILE_DELETED BIT(27) /* File is deleted */ #define DCACHE_PAR_LOOKUP BIT(28) /* being looked up (with parent locked shared) */ #define DCACHE_DENTRY_CURSOR BIT(29) #define DCACHE_NORCU BIT(30) /* No RCU delay for freeing */ -- 2.39.1