On Thu, May 02, 2024 at 02:59:20PM +0000, Allen Pais wrote:
> Introduce the capability to dynamically configure the maximum file
> note size for ELF core dumps via sysctl. This enhancement removes
> the previous static limit of 4MB, allowing system administrators to
> adjust the size based on system-specific requirements or constraints.
>
> - Remove hardcoded `MAX_FILE_NOTE_SIZE` from `fs/binfmt_elf.c`.
> - Define `max_file_note_size` in `fs/coredump.c` with an initial value
>   set to 4MB.
> - Declare `max_file_note_size` as an external variable in
>   `include/linux/coredump.h`.
> - Add a new sysctl entry in `kernel/sysctl.c` to manage this setting
>   at runtime.
>
> $ sysctl -a | grep max_file_note_size
> kernel.max_file_note_size = 4194304
>
> $ sysctl -n kernel.max_file_note_size
> 4194304
>
> $echo 519304 > /proc/sys/kernel/max_file_note_size
>
> $sysctl -n kernel.max_file_note_size
> 519304

The names and paths in the commit log need a refresh here, since they've
changed.

>
> Why is this being done?
> We have observed that during a crash when there are more than 65k mmaps
> in memory, the existing fixed limit on the size of the ELF notes section
> becomes a bottleneck. The notes section quickly reaches its capacity,
> leading to incomplete memory segment information in the resulting coredump.
> This truncation compromises the utility of the coredumps, as crucial
> information about the memory state at the time of the crash might be
> omitted.

Thanks for adding this!

>
> Signed-off-by: Vijay Nag <nagvijay@xxxxxxxxxxxxx>
> Signed-off-by: Allen Pais <apais@xxxxxxxxxxxxxxxxxxx>
>
> ---
> Changes in v2:
>    - Move new sysctl to fs/coredump.c [Luis & Kees]
>    - rename max_file_note_size to core_file_note_size_max [kees]
>    - Capture "why this is being done?"
>      in the commit message [Luis & Kees]
> ---
>  fs/binfmt_elf.c          |  3 +--
>  fs/coredump.c            | 10 ++++++++++
>  include/linux/coredump.h |  1 +
>  3 files changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
> index 5397b552fbeb..6aebd062b92b 100644
> --- a/fs/binfmt_elf.c
> +++ b/fs/binfmt_elf.c
> @@ -1564,7 +1564,6 @@ static void fill_siginfo_note(struct memelfnote *note, user_siginfo_t *csigdata,
>  	fill_note(note, "CORE", NT_SIGINFO, sizeof(*csigdata), csigdata);
>  }
>
> -#define MAX_FILE_NOTE_SIZE (4*1024*1024)
>  /*
>   * Format of NT_FILE note:
>   *
> @@ -1592,7 +1591,7 @@ static int fill_files_note(struct memelfnote *note, struct coredump_params *cprm
>
>  	names_ofs = (2 + 3 * count) * sizeof(data[0]);
>   alloc:
> -	if (size >= MAX_FILE_NOTE_SIZE) /* paranoia check */
> +	if (size >= core_file_note_size_max) /* paranoia check */
>  		return -EINVAL;

I wonder, given the purpose of this sysctl, if it would be a
discoverability improvement to include a pr_warn_once() before the
EINVAL? Like:

	/* paranoia check */
	if (size >= core_file_note_size_max) {
		pr_warn_once("coredump Note size too large: %zu (does kernel.core_file_note_size_max sysctl need adjustment?)\n",
			     size);
		return -EINVAL;
	}

What do folks think? (I can't imagine tracking down this problem
originally was much fun, for example.)

>  	size = round_up(size, PAGE_SIZE);
>  	/*
> diff --git a/fs/coredump.c b/fs/coredump.c
> index be6403b4b14b..a312be48030f 100644
> --- a/fs/coredump.c
> +++ b/fs/coredump.c
> @@ -56,10 +56,13 @@
>  static bool dump_vma_snapshot(struct coredump_params *cprm);
>  static void free_vma_snapshot(struct coredump_params *cprm);
>
> +#define MAX_FILE_NOTE_SIZE (4*1024*1024)
> +
>  static int core_uses_pid;
>  static unsigned int core_pipe_limit;
>  static char core_pattern[CORENAME_MAX_SIZE] = "core";
>  static int core_name_size = CORENAME_MAX_SIZE;
> +unsigned int core_file_note_size_max = MAX_FILE_NOTE_SIZE;
>
>  struct core_name {
>  	char *corename;
> @@ -1020,6 +1023,13 @@ static struct ctl_table coredump_sysctls[] = {
>  		.mode		= 0644,
>  		.proc_handler	= proc_dointvec,
>  	},
> +	{
> +		.procname	= "core_file_note_size_max",
> +		.data		= &core_file_note_size_max,
> +		.maxlen		= sizeof(unsigned int),
> +		.mode		= 0644,
> +		.proc_handler	= proc_douintvec,
> +	},
>  };
>
>  static int __init init_fs_coredump_sysctls(void)
> diff --git a/include/linux/coredump.h b/include/linux/coredump.h
> index d3eba4360150..14c057643e7f 100644
> --- a/include/linux/coredump.h
> +++ b/include/linux/coredump.h
> @@ -46,6 +46,7 @@ static inline void do_coredump(const kernel_siginfo_t *siginfo) {}
>  #endif
>
>  #if defined(CONFIG_COREDUMP) && defined(CONFIG_SYSCTL)
> +extern unsigned int core_file_note_size_max;
>  extern void validate_coredump_safety(void);
>  #else
>  static inline void validate_coredump_safety(void) {}
> --
> 2.17.1

Otherwise, yes, this looks good to me.

--
Kees Cook