On Thu, Oct 10, 2024 at 7:04 PM Jan Kara <jack@xxxxxxx> wrote: > > On Thu 10-10-24 09:35:46, Benjamin Coddington wrote: > > On 10 Oct 2024, at 8:16, Jan Kara wrote: > > > > > On Thu 10-10-24 19:25:42, Ye Bin wrote: > > >> From: Ye Bin <yebin10@xxxxxxxxxx> > > >> > > >> In order to better analyze the issue of file system uninstallation caused > > >> by kernel module opening files, it is necessary to perform dentry recycling > > > > > > I don't quite understand the use case you mention here. Can you explain it > > > a bit more (that being said I've needed dropping caches for a particular sb > > > myself a few times for debugging purposes so I generally agree it is a > > > useful feature). > > > > > >> on a single file system. But now, apart from global dentry recycling, it is > > >> not supported to do dentry recycling on a single file system separately. > > >> This feature has usage scenarios in problem localization scenarios.At the > > >> same time, it also provides users with a slightly fine-grained > > >> pagecache/entry recycling mechanism. > > >> This patch supports the recycling of pagecache/entry for individual file > > >> systems. > > >> > > >> Signed-off-by: Ye Bin <yebin10@xxxxxxxxxx> > > >> --- > > >> fs/drop_caches.c | 43 +++++++++++++++++++++++++++++++++++++++++++ > > >> include/linux/mm.h | 2 ++ > > >> kernel/sysctl.c | 9 +++++++++ > > >> 3 files changed, 54 insertions(+) > > >> > > >> diff --git a/fs/drop_caches.c b/fs/drop_caches.c > > >> index d45ef541d848..99d412cf3e52 100644 > > >> --- a/fs/drop_caches.c > > >> +++ b/fs/drop_caches.c > > >> @@ -77,3 +77,46 @@ int drop_caches_sysctl_handler(const struct ctl_table *table, int write, > > >> } > > >> return 0; > > >> } > > >> + > > >> +int drop_fs_caches_sysctl_handler(const struct ctl_table *table, int write, > > >> + void *buffer, size_t *length, loff_t *ppos) > > >> +{ > > >> + unsigned int major, minor; > > >> + unsigned int ctl; > > >> + struct super_block *sb; > > >> + static int stfu; > > >> + > > >> + if (!write) > > >> + return 0; > > >> + > > >> + if (sscanf(buffer, "%u:%u:%u", &major, &minor, &ctl) != 3) > > >> + return -EINVAL; > > > > > > I think specifying bdev major & minor number is not a great interface these > > > days. In particular for filesystems which are not bdev based such as NFS. I > > > think specifying path to some file/dir in the filesystem is nicer and you > > > can easily resolve that to sb here as well. > > > > Slight disagreement here since NFS uses set_anon_super() and major:minor > > will work fine with it. > > OK, fair point, anon bdev numbers can be used. But filesystems using > get_tree_nodev() would still be problematic. > > > I'd prefer it actually since it avoids this > > interface having to do a pathwalk and make decisions about what's mounted > > where and in what namespace. > > I don't understand the problem here. We'd do user_path_at(AT_FDCWD, ..., > &path) and then take path.mnt->mnt_sb. That doesn't look terribly > complicated to me. Plus it naturally deals with issues like namespacing > etc. although they are not a huge issue here because the functionality > should be restricted to CAP_SYS_ADMIN anyway. > Both looking up bdev and looking up path from write() can make syzbot and lockdep very upset: https://lore.kernel.org/linux-fsdevel/00000000000098f75506153551a1@xxxxxxxxxx/ I thought Christian had a proposal for dropping cache per-sb API via fadvise() or something? Why use sysfs API for this and not fd to reference an sb? Thanks, Amir.