On Wed, Feb 9, 2022 at 3:39 PM Tong Zhang <ztong0001@xxxxxxxxx> wrote: > > On Wed, Feb 9, 2022 at 3:29 PM Luis Chamberlain <mcgrof@xxxxxxxxxx> wrote: > > > > On Wed, Feb 09, 2022 at 03:15:53PM -0800, Tong Zhang wrote: > > > On Wed, Feb 9, 2022 at 2:58 PM Luis Chamberlain <mcgrof@xxxxxxxxxx> wrote: > > > > > > > > This is the second attempt to move binfmt_misc sysctl to its > > > > own file. The issue with the first move was that we moved > > > > the binfmt_misc sysctls to the binfmt_misc module, but the > > > > way things work on some systems is that the binfmt_misc > > > > module will load if the sysctl is present. If we don't force > > > > the sysctl on, the module won't load. The proper thing to do > > > > is to register the sysctl if the module was built or the > > > > binfmt_misc code was built-in, we do this by using the helper > > > > IS_ENABLED() now. > > > > > > > > The rationale for the move: > > > > > > > > kernel/sysctl.c is a kitchen sink where everyone leaves their dirty > > > > dishes, this makes it very difficult to maintain. > > > > > > > > To help with this maintenance let's start by moving sysctls to places > > > > where they actually belong. The proc sysctl maintainers do not want to > > > > know what sysctl knobs you wish to add for your own piece of code, we > > > > just care about the core logic. > > > > > > > > This moves the binfmt_misc sysctl to its own file to help remove clutter > > > > from kernel/sysctl.c. > > > > > > > > Cc: Domenico Andreoli <domenico.andreoli@xxxxxxxxx> > > > > Cc: Tong Zhang <ztong0001@xxxxxxxxx> > > > > Reviewed-by: Tong Zhang <ztong0001@xxxxxxxxx> > > > > Signed-off-by: Luis Chamberlain <mcgrof@xxxxxxxxxx> > > > > --- > > > > > > > > Andrew, > > > > > > > > If we get tested-by from Domenico and Tong I think this is ready. > > > > > > > > Demenico, Tong, can you please test this patch? Linus' tree > > > > should already have all the prior work reverted as Domenico requested > > > > so this starts fresh. > > > > > > > > fs/file_table.c | 2 ++ > > > > kernel/sysctl.c | 13 ------------- > > > > 2 files changed, 2 insertions(+), 13 deletions(-) > > > > > > > > diff --git a/fs/file_table.c b/fs/file_table.c > > > > index 57edef16dce4..4969021fa676 100644 > > > > --- a/fs/file_table.c > > > > +++ b/fs/file_table.c > > > > @@ -119,6 +119,8 @@ static struct ctl_table fs_stat_sysctls[] = { > > > > static int __init init_fs_stat_sysctls(void) > > > > { > > > > register_sysctl_init("fs", fs_stat_sysctls); > > > > + if (IS_ENABLED(CONFIG_BINFMT_MISC)) > > > > + register_sysctl_mount_point("fs/binfmt_misc"); ^^^^ I'm looking at this code again and we need to mark this return value in kmemleak to avoid a false positive. diff --git a/fs/file_table.c b/fs/file_table.c index 4969021fa676..7303aa33b3fd 100644 --- a/fs/file_table.c +++ b/fs/file_table.c @@ -27,6 +27,7 @@ #include <linux/task_work.h> #include <linux/ima.h> #include <linux/swap.h> +#include <linux/kmemleak.h> #include <linux/atomic.h> @@ -119,8 +120,10 @@ static struct ctl_table fs_stat_sysctls[] = { static int __init init_fs_stat_sysctls(void) { register_sysctl_init("fs", fs_stat_sysctls); - if (IS_ENABLED(CONFIG_BINFMT_MISC)) - register_sysctl_mount_point("fs/binfmt_misc"); + if (IS_ENABLED(CONFIG_BINFMT_MISC)) { + struct ctl_table_header *hdr = register_sysctl_mount_point("fs/binfmt_misc"); + kmemleak_not_leak(hdr); + } return 0; } fs_initcall(init_fs_stat_sysctls);