On Thu 12-09-24 17:15:48, Yafang Shao wrote: > Commit 681ce8623567 ("vfs: Delete the associated dentry when deleting a > file") introduced an unconditional deletion of the associated dentry when a > file is removed. However, this led to performance regressions in specific > benchmarks, such as ilebench.sum_operations/s [0], prompting a revert in > commit 4a4be1ad3a6e ("Revert 'vfs: Delete the associated dentry when > deleting a file'"). > > This patch seeks to reintroduce the concept conditionally, where the > associated dentry is deleted only when the user explicitly opts for it > during file removal. > > There are practical use cases for this proactive dentry reclamation. > Besides the Elasticsearch use case mentioned in commit 681ce8623567, > additional examples have surfaced in our production environment. For > instance, in video rendering services that continuously generate temporary > files, upload them to persistent storage servers, and then delete them, a > large number of negative dentries—serving no useful purpose—accumulate. > Users in such cases would benefit from proactively reclaiming these > negative dentries. This patch provides an API allowing users to actively > delete these unnecessary negative dentries. > > Link: https://lore.kernel.org/linux-fsdevel/202405291318.4dfbb352-oliver.sang@xxxxxxxxx [0] > Signed-off-by: Yafang Shao <laoar.shao@xxxxxxxxx> > Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> > Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx> > Cc: Christian Brauner <brauner@xxxxxxxxxx> > Cc: Jan Kara <jack@xxxxxxx> Umm, I don't think we want to burn a FMODE flag and expose these details of dentry reclaim to userspace. BTW, if we wanted to do this, we already have d_mark_dontcache() for in-kernel users which we could presumably reuse. But I would not completely give up on trying to handle this in an automated way inside the kernel. The original solution you mention above was perhaps too aggressive but maybe d_delete() could just mark the dentry with a "deleted" flag, retain_dentry() would move such dentries into a special LRU list which we'd prune once in a while (say once per 5 s). Also this list would be automatically pruned from prune_dcache_sb(). This way if there's quick reuse of a dentry, it will get reused and no harm is done, if it isn't quickly reused, we'll free them to not waste memory. What do people think about such scheme? Honza > --- > fs/dcache.c | 7 ++++++- > fs/open.c | 9 ++++++++- > include/linux/dcache.h | 2 +- > include/linux/sched.h | 2 +- > include/uapi/asm-generic/fcntl.h | 4 ++++ > 5 files changed, 20 insertions(+), 4 deletions(-) > > diff --git a/fs/dcache.c b/fs/dcache.c > index 3d8daaecb6d1..6d744b5e5a6c 100644 > --- a/fs/dcache.c > +++ b/fs/dcache.c > @@ -1667,7 +1667,10 @@ static struct dentry *__d_alloc(struct super_block *sb, const struct qstr *name) > smp_store_release(&dentry->d_name.name, dname); /* ^^^ */ > > dentry->d_lockref.count = 1; > - dentry->d_flags = 0; > + if (current->flags & PF_REMOVE_DENTRY) > + dentry->d_flags = DCACHE_FILE_REMOVE; > + else > + dentry->d_flags = 0; > spin_lock_init(&dentry->d_lock); > seqcount_spinlock_init(&dentry->d_seq, &dentry->d_lock); > dentry->d_inode = NULL; > @@ -2394,6 +2397,8 @@ void d_delete(struct dentry * dentry) > * Are we the only user? > */ > if (dentry->d_lockref.count == 1) { > + if (dentry->d_flags & DCACHE_FILE_REMOVE) > + __d_drop(dentry); > dentry->d_flags &= ~DCACHE_CANT_MOUNT; > dentry_unlink_inode(dentry); > } else { > diff --git a/fs/open.c b/fs/open.c > index 22adbef7ecc2..3441a004a841 100644 > --- a/fs/open.c > +++ b/fs/open.c > @@ -1428,7 +1428,14 @@ static long do_sys_openat2(int dfd, const char __user *filename, > long do_sys_open(int dfd, const char __user *filename, int flags, umode_t mode) > { > struct open_how how = build_open_how(flags, mode); > - return do_sys_openat2(dfd, filename, &how); > + long err; > + > + if (flags & O_NODENTRY) > + current->flags |= PF_REMOVE_DENTRY; > + err = do_sys_openat2(dfd, filename, &how); > + if (flags & O_NODENTRY) > + current->flags &= ~PF_REMOVE_DENTRY; > + return err; > } > > > diff --git a/include/linux/dcache.h b/include/linux/dcache.h > index bff956f7b2b9..82ba79bc0072 100644 > --- a/include/linux/dcache.h > +++ b/include/linux/dcache.h > @@ -215,7 +215,7 @@ struct dentry_operations { > > #define DCACHE_NOKEY_NAME BIT(25) /* Encrypted name encoded without key */ > #define DCACHE_OP_REAL BIT(26) > - > +#define DCACHE_FILE_REMOVE BIT(27) /* remove this dentry when file is removed */ > #define DCACHE_PAR_LOOKUP BIT(28) /* being looked up (with parent locked shared) */ > #define DCACHE_DENTRY_CURSOR BIT(29) > #define DCACHE_NORCU BIT(30) /* No RCU delay for freeing */ > diff --git a/include/linux/sched.h b/include/linux/sched.h > index f8d150343d42..f931a3a882e0 100644 > --- a/include/linux/sched.h > +++ b/include/linux/sched.h > @@ -1649,7 +1649,7 @@ extern struct pid *cad_pid; > #define PF_USED_MATH 0x00002000 /* If unset the fpu must be initialized before use */ > #define PF_USER_WORKER 0x00004000 /* Kernel thread cloned from userspace thread */ > #define PF_NOFREEZE 0x00008000 /* This thread should not be frozen */ > -#define PF__HOLE__00010000 0x00010000 > +#define PF_REMOVE_DENTRY 0x00010000 /* Remove the dentry when the file is removed */ > #define PF_KSWAPD 0x00020000 /* I am kswapd */ > #define PF_MEMALLOC_NOFS 0x00040000 /* All allocations inherit GFP_NOFS. See memalloc_nfs_save() */ > #define PF_MEMALLOC_NOIO 0x00080000 /* All allocations inherit GFP_NOIO. See memalloc_noio_save() */ > diff --git a/include/uapi/asm-generic/fcntl.h b/include/uapi/asm-generic/fcntl.h > index 80f37a0d40d7..ca5f402d5e7d 100644 > --- a/include/uapi/asm-generic/fcntl.h > +++ b/include/uapi/asm-generic/fcntl.h > @@ -89,6 +89,10 @@ > #define __O_TMPFILE 020000000 > #endif > > +#ifndef O_NODENTRY > +#define O_NODENTRY 040000000 > +#endif > + > /* a horrid kludge trying to make sure that this will fail on old kernels */ > #define O_TMPFILE (__O_TMPFILE | O_DIRECTORY) > > -- > 2.43.5 > -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR