
The patch titled
     Subject: hung_task: add task->flags, blocked by coredump to log
has been added to the -mm mm-nonmm-unstable branch.  Its filename is
     hung_task-add-task-flags-blocked-by-coredump-to-log.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/hung_task-add-task-flags-blocked-by-coredump-to-log.patch

This patch will later appear in the mm-nonmm-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days

------------------------------------------------------
From: Oxana Kharitonova <oxana@xxxxxxxxxxxxxx>
Subject: hung_task: add task->flags, blocked by coredump to log
Date: Fri, 10 Jan 2025 16:03:28 +0000

Resending this patch as I haven't received feedback on my initial
submission https://lore.kernel.org/all/20241204182953.10854-1-oxana@xxxxxxxxxxxxxx/

For processes which are terminated abnormally, the kernel can provide a
coredump if enabled.  While the coredump is performed, the process and all
its threads are put into the D state
(TASK_UNINTERRUPTIBLE | TASK_FREEZABLE).

On the other hand, we have the kernel thread khungtaskd, which monitors
processes in the D state.  If a task is stuck in the D state for more than
kernel.hung_task_timeout_secs, a hung_task alert appears in the kernel
log.

The higher the memory usage of a process, the longer it takes to create
the coredump and the longer its tasks stay in the D state.  We see
hung_task alerts for processes with memory usage above 10GB.  That said,
our kernel.hung_task_timeout_secs is 10 sec, whereas the default is
120 sec.

Adding information to the log that the task is blocked by a coredump will
help with monitoring.  Another approach might be to filter out alerts for
such tasks completely, but in that case we would lose transparency about
what is putting pressure on some system resources, e.g. we saw an
increase in I/O when a coredump occurs due to it being written to disk.

Additionally, it would be helpful to have task_struct->flags in the log
from the function sched_show_task().  Currently it prints
task_struct->thread_info->flags; this seems misleading as the line starts
with "task:xxxx".

Link: https://lkml.kernel.org/r/20250110160328.64947-1-oxana@xxxxxxxxxxxxxx
Signed-off-by: Oxana Kharitonova <oxana@xxxxxxxxxxxxxx>
Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
Cc: Ben Segall <bsegall@xxxxxxxxxx>
Cc: Christian Brauner <brauner@xxxxxxxxxx>
Cc: Dietmar Eggemann <dietmar.eggemann@xxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Jan Kara <jack@xxxxxxx>
Cc: Juri Lelli <juri.lelli@xxxxxxxxxx>
Cc: Mel Gorman <mgorman@xxxxxxx>
Cc: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
Cc: Valentin Schneider <vschneid@xxxxxxxxxx>
Cc: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 kernel/hung_task.c  |    2 ++
 kernel/sched/core.c |    4 ++--
 2 files changed, 4 insertions(+), 2 deletions(-)

--- a/kernel/hung_task.c~hung_task-add-task-flags-blocked-by-coredump-to-log
+++ a/kernel/hung_task.c
@@ -147,6 +147,8 @@ static void check_hung_task(struct task_
 			print_tainted(), init_utsname()->release,
 			(int)strcspn(init_utsname()->version, " "),
 			init_utsname()->version);
+		if (t->flags & PF_POSTCOREDUMP)
+			pr_err(" Blocked by coredump.\n");
 		pr_err("\"echo 0 > /proc/sys/kernel/hung_task_timeout_secs\""
 			" disables this message.\n");
 		sched_show_task(t);
--- a/kernel/sched/core.c~hung_task-add-task-flags-blocked-by-coredump-to-log
+++ a/kernel/sched/core.c
@@ -7701,9 +7701,9 @@ void sched_show_task(struct task_struct
 	if (pid_alive(p))
 		ppid = task_pid_nr(rcu_dereference(p->real_parent));
 	rcu_read_unlock();
-	pr_cont(" stack:%-5lu pid:%-5d tgid:%-5d ppid:%-6d flags:0x%08lx\n",
+	pr_cont(" stack:%-5lu pid:%-5d tgid:%-5d ppid:%-6d task_flags:0x%08lx flags:0x%08lx\n",
 		free, task_pid_nr(p), task_tgid_nr(p),
-		ppid, read_task_thread_flags(p));
+		ppid, p->flags, read_task_thread_flags(p));
 
 	print_worker_info(KERN_INFO, p);
 	print_stop_info(KERN_INFO, p);
_

Patches currently in -mm which might be from oxana@xxxxxxxxxxxxxx are

hung_task-add-task-flags-blocked-by-coredump-to-log.patch
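
For the monitoring use case described in the changelog, a userspace log
consumer could decode the new task_flags: field rather than match on the
"Blocked by coredump." string.  The following is only a minimal sketch and
not part of the patch.  The helper names parse_task_flags() and
blocked_by_coredump() are hypothetical, and the value 0x00000008 for
PF_POSTCOREDUMP is an assumption taken from include/linux/sched.h in
recent kernels; it is not a stable ABI and should be checked against the
kernel actually in use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Assumption: value of PF_POSTCOREDUMP in include/linux/sched.h of recent
 * kernels. Verify it against the running kernel's sources.
 */
#define PF_POSTCOREDUMP 0x00000008UL

/*
 * Hypothetical helper: extract the hex value that follows "task_flags:"
 * in one sched_show_task() log line. Returns false if the field is
 * missing, e.g. on a kernel without this patch.
 */
static bool parse_task_flags(const char *line, unsigned long *flags)
{
	static const char key[] = "task_flags:";
	const char *p;
	char *end;

	assert(line != NULL && flags != NULL);

	p = strstr(line, key);
	if (!p)
		return false;
	p += sizeof(key) - 1;

	/* Base 16 accepts the optional "0x" prefix printed by the kernel. */
	*flags = strtoul(p, &end, 16);
	return end != p;
}

/* Hypothetical helper: true if the logged task has PF_POSTCOREDUMP set. */
static bool blocked_by_coredump(const char *line)
{
	unsigned long flags;

	return parse_task_flags(line, &flags) && (flags & PF_POSTCOREDUMP);
}
```

A monitoring tool could call blocked_by_coredump() on each "task:" line of
a hung_task report and tag the alert instead of dropping it, which keeps
the transparency the changelog asks for.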