Re: [PATCH RFC] vfs: Introduce a new open flag to imply dentry deletion on file removal

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu 12-09-24 17:15:48, Yafang Shao wrote:
> Commit 681ce8623567 ("vfs: Delete the associated dentry when deleting a
> file") introduced an unconditional deletion of the associated dentry when a
> file is removed. However, this led to performance regressions in specific
> benchmarks, such as ilebench.sum_operations/s [0], prompting a revert in
> commit 4a4be1ad3a6e ("Revert 'vfs: Delete the associated dentry when
> deleting a file'").
> 
> This patch seeks to reintroduce the concept conditionally, where the
> associated dentry is deleted only when the user explicitly opts for it
> during file removal.
> 
> There are practical use cases for this proactive dentry reclamation.
> Besides the Elasticsearch use case mentioned in commit 681ce8623567,
> additional examples have surfaced in our production environment. For
> instance, in video rendering services that continuously generate temporary
> files, upload them to persistent storage servers, and then delete them, a
> large number of negative dentries—serving no useful purpose—accumulate.
> Users in such cases would benefit from proactively reclaiming these
> negative dentries. This patch provides an API allowing users to actively
> delete these unnecessary negative dentries.
> 
> Link: https://lore.kernel.org/linux-fsdevel/202405291318.4dfbb352-oliver.sang@xxxxxxxxx [0]
> Signed-off-by: Yafang Shao <laoar.shao@xxxxxxxxx>
> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
> Cc: Christian Brauner <brauner@xxxxxxxxxx>
> Cc: Jan Kara <jack@xxxxxxx>

Umm, I don't think we want to burn a FMODE flag and expose these details of
dentry reclaim to userspace. BTW, if we wanted to do this, we already have
d_mark_dontcache() for in-kernel users which we could presumably reuse.

But I would not completely give up on trying to handle this in an automated
way inside the kernel. The original solution you mention above was perhaps
too aggressive but maybe d_delete() could just mark the dentry with a
"deleted" flag, retain_dentry() would move such dentries into a special LRU
list which we'd prune once in a while (say once per 5 s). Also this list
would be automatically pruned from prune_dcache_sb(). This way if there's
quick reuse of a dentry, it will get reused and no harm is done, if it
isn't quickly reused, we'll free them to not waste memory.

What do people think about such scheme?

								Honza

> ---
>  fs/dcache.c                      | 7 ++++++-
>  fs/open.c                        | 9 ++++++++-
>  include/linux/dcache.h           | 2 +-
>  include/linux/sched.h            | 2 +-
>  include/uapi/asm-generic/fcntl.h | 4 ++++
>  5 files changed, 20 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/dcache.c b/fs/dcache.c
> index 3d8daaecb6d1..6d744b5e5a6c 100644
> --- a/fs/dcache.c
> +++ b/fs/dcache.c
> @@ -1667,7 +1667,10 @@ static struct dentry *__d_alloc(struct super_block *sb, const struct qstr *name)
>  	smp_store_release(&dentry->d_name.name, dname); /* ^^^ */
>  
>  	dentry->d_lockref.count = 1;
> -	dentry->d_flags = 0;
> +	if (current->flags & PF_REMOVE_DENTRY)
> +		dentry->d_flags = DCACHE_FILE_REMOVE;
> +	else
> +		dentry->d_flags = 0;
>  	spin_lock_init(&dentry->d_lock);
>  	seqcount_spinlock_init(&dentry->d_seq, &dentry->d_lock);
>  	dentry->d_inode = NULL;
> @@ -2394,6 +2397,8 @@ void d_delete(struct dentry * dentry)
>  	 * Are we the only user?
>  	 */
>  	if (dentry->d_lockref.count == 1) {
> +		if (dentry->d_flags & DCACHE_FILE_REMOVE)
> +			__d_drop(dentry);
>  		dentry->d_flags &= ~DCACHE_CANT_MOUNT;
>  		dentry_unlink_inode(dentry);
>  	} else {
> diff --git a/fs/open.c b/fs/open.c
> index 22adbef7ecc2..3441a004a841 100644
> --- a/fs/open.c
> +++ b/fs/open.c
> @@ -1428,7 +1428,14 @@ static long do_sys_openat2(int dfd, const char __user *filename,
>  long do_sys_open(int dfd, const char __user *filename, int flags, umode_t mode)
>  {
>  	struct open_how how = build_open_how(flags, mode);
> -	return do_sys_openat2(dfd, filename, &how);
> +	long err;
> +
> +	if (flags & O_NODENTRY)
> +		current->flags |= PF_REMOVE_DENTRY;
> +	err = do_sys_openat2(dfd, filename, &how);
> +	if (flags & O_NODENTRY)
> +		current->flags &= ~PF_REMOVE_DENTRY;
> +	return err;
>  }
>  
>  
> diff --git a/include/linux/dcache.h b/include/linux/dcache.h
> index bff956f7b2b9..82ba79bc0072 100644
> --- a/include/linux/dcache.h
> +++ b/include/linux/dcache.h
> @@ -215,7 +215,7 @@ struct dentry_operations {
>  
>  #define DCACHE_NOKEY_NAME		BIT(25) /* Encrypted name encoded without key */
>  #define DCACHE_OP_REAL			BIT(26)
> -
> +#define DCACHE_FILE_REMOVE		BIT(27) /* remove this dentry when file is removed */
>  #define DCACHE_PAR_LOOKUP		BIT(28) /* being looked up (with parent locked shared) */
>  #define DCACHE_DENTRY_CURSOR		BIT(29)
>  #define DCACHE_NORCU			BIT(30) /* No RCU delay for freeing */
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index f8d150343d42..f931a3a882e0 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1649,7 +1649,7 @@ extern struct pid *cad_pid;
>  #define PF_USED_MATH		0x00002000	/* If unset the fpu must be initialized before use */
>  #define PF_USER_WORKER		0x00004000	/* Kernel thread cloned from userspace thread */
>  #define PF_NOFREEZE		0x00008000	/* This thread should not be frozen */
> -#define PF__HOLE__00010000	0x00010000
> +#define PF_REMOVE_DENTRY	0x00010000      /* Remove the dentry when the file is removed */
>  #define PF_KSWAPD		0x00020000	/* I am kswapd */
>  #define PF_MEMALLOC_NOFS	0x00040000	/* All allocations inherit GFP_NOFS. See memalloc_nfs_save() */
>  #define PF_MEMALLOC_NOIO	0x00080000	/* All allocations inherit GFP_NOIO. See memalloc_noio_save() */
> diff --git a/include/uapi/asm-generic/fcntl.h b/include/uapi/asm-generic/fcntl.h
> index 80f37a0d40d7..ca5f402d5e7d 100644
> --- a/include/uapi/asm-generic/fcntl.h
> +++ b/include/uapi/asm-generic/fcntl.h
> @@ -89,6 +89,10 @@
>  #define __O_TMPFILE	020000000
>  #endif
>  
> +#ifndef O_NODENTRY
> +#define O_NODENTRY     040000000
> +#endif
> +
>  /* a horrid kludge trying to make sure that this will fail on old kernels */
>  #define O_TMPFILE (__O_TMPFILE | O_DIRECTORY)
>  
> -- 
> 2.43.5
> 
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux