On 2024-06-17 16:41:30 [-0700], Roman Kisel wrote: > Missing, failed, or corrupted core dumps might impede crash > investigations. To improve reliability of that process and consequently > the programs themselves, one needs to trace the path from producing > a core dumpfile to analyzing it. That path starts from the core dump file > written to the disk by the kernel or to the standard input of a user > mode helper program to which the kernel streams the coredump contents. > There are cases where the kernel will interrupt writing the core out or > produce a truncated/not-well-formed core dump. How much of this happened and how much of this is just "let me handle everything that could go wrong". The cases where it was interrupted without a hint probably deserve a note rather then leaving a half of coredump back. > Signed-off-by: Roman Kisel <romank@xxxxxxxxxxxxxxxxxxx> > diff --git a/fs/coredump.c b/fs/coredump.c > index a57a06b80f57..a7200c9024c6 100644 > --- a/fs/coredump.c > +++ b/fs/coredump.c > @@ -777,9 +807,18 @@ void do_coredump(const kernel_siginfo_t *siginfo) > } > file_end_write(cprm.file); > free_vma_snapshot(&cprm); > + } else { > + pr_err("Core dump to |%s has been interrupted\n", cn.corename); > + retval = -EAGAIN; > + goto fail; > } > + pr_info("Core dump to |%s: vma_count %d, vma_data_size %lu, written %lld bytes, pos %lld\n", > + cn.corename, cprm.vma_count, cprm.vma_data_size, cprm.written, cprm.pos); Probably too noisy in the default case. The offsets probably don't matter unless you debug. > if (ispipe && core_pipe_limit) > wait_for_dump_helpers(cprm.file); > + > + retval = 0; > + > close_fail: > if (cprm.file) > filp_close(cprm.file, NULL); > diff --git a/include/linux/coredump.h b/include/linux/coredump.h > index 0904ba010341..8b29be758a87 100644 > --- a/include/linux/coredump.h > +++ b/include/linux/coredump.h > @@ -42,9 +42,9 @@ extern int dump_emit(struct coredump_params *cprm, const void *addr, int nr); > extern int dump_align(struct coredump_params *cprm, int align); > int dump_user_range(struct coredump_params *cprm, unsigned long start, > unsigned long len); > -extern void do_coredump(const kernel_siginfo_t *siginfo); > +extern int do_coredump(const kernel_siginfo_t *siginfo); > #else > -static inline void do_coredump(const kernel_siginfo_t *siginfo) {} > +static inline int do_coredump(const kernel_siginfo_t *siginfo) {} This probably does not compile. > #endif > > #if defined(CONFIG_COREDUMP) && defined(CONFIG_SYSCTL) > diff --git a/kernel/signal.c b/kernel/signal.c > index 1f9dd41c04be..f2ecf29a994d 100644 > --- a/kernel/signal.c > +++ b/kernel/signal.c > @@ -2675,6 +2675,7 @@ bool get_signal(struct ksignal *ksig) > struct sighand_struct *sighand = current->sighand; > struct signal_struct *signal = current->signal; > int signr; > + int ret; > > clear_notify_signal(); > if (unlikely(task_work_pending(current))) > @@ -2891,7 +2892,9 @@ bool get_signal(struct ksignal *ksig) > * first and our do_group_exit call below will use > * that value and ignore the one we pass it. > */ > - do_coredump(&ksig->info); > + ret = do_coredump(&ksig->info); > + if (ret) > + pr_err("coredump has not been created, error %d\n", ret); So you preserve the error code just for one additional note. > } > > /* Sebastian