+ hung_task-allow-printing-warnings-every-check-interval.patch added to -mm tree

akpm@xxxxxxxxxxxxxxxxxxxx · Wed, 24 Jul 2019 17:03:26 -0700

The patch titled
     Subject: hung_task: allow printing warnings every check interval
has been added to the -mm tree.  Its filename is
     hung_task-allow-printing-warnings-every-check-interval.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/hung_task-allow-printing-warnings-every-check-interval.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/hung_task-allow-printing-warnings-every-check-interval.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Dmitry Safonov <dima@xxxxxxxxxx>
Subject: hung_task: allow printing warnings every check interval

The hung task detector has a single timeout with two associated actions:

- issuing warnings with names and stacks of blocked tasks
- panic()

We want switches to panic (and reboot) if there's a task in
uninterruptible sleep for some minutes - at that moment something ugly has
happened and the box needs a reboot.  But we also want to detect
conditions that are "out of range" or approaching the point of failure. 
Under such conditions we want to issue an "early warning" of an impending
failure, minutes before the switch is going to panic.

Those "early warnings" serve a purpose while monitoring the network
infrastructure.  They are also valuable in post-mortem analysis, when the
logs from userspace applications aren't enough.  Furthermore, we have a
test pool of long-running DUTs (devices under test) that are kept under
close to real-world load for weeks, and such early warnings have allowed
us to find some bottlenecks without much engineering effort.

There are also not-yet-upstream patches for other kinds of "early
warnings", such as prints whenever a mutex/semaphore is released after
being held for a long time, but those patches are much more intricate and
have a runtime cost.

It seems rather easy to add printing of tasks and their stacks, for
notification and debugging purposes, to the hung task detector without
complicating the code or adding major cost (the prints use the KERN_INFO
loglevel and so don't go to the console, only into the dmesg log).

Since a2e514453861 ("kernel/hung_task.c: allow to set checking interval
separately from timeout") it's possible to set the checking interval for
the hung task detector with `hung_task_check_interval_secs`.

Provide a `hung_task_interval_warnings` sysctl that allows printing hung
tasks every detection interval.  It's not ratelimited, so root should be
cautious when configuring it.
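
To illustrate how the two warning budgets interact, here is a rough
user-space sketch of the decision made on each detector pass.  It is not
kernel code: struct hung_cfg, model_check() and the REPORT_* flags are
made-up names for illustration, and the panic path and the context-switch
check are left out.  It only tries to mirror the order of the checks in
the diff below.

```c
/*
 * User-space model of the warning decision in the patched check_hung_task().
 * All names here are illustrative assumptions, not kernel symbols.
 */
enum {
	REPORT_INTERVAL = 1 << 0,	/* models hung_task_warning(t, false) */
	REPORT_TIMEOUT  = 1 << 1,	/* models hung_task_warning(t, true) */
};

struct hung_cfg {
	int warnings;		/* models sysctl_hung_task_warnings */
	int interval_warnings;	/* models sysctl_hung_task_interval_warnings */
};

/* Use up one warning from a budget; -1 means unlimited, 0 is left alone. */
static void consume_warning(int *budget)
{
	if (*budget > 0)
		--*budget;
}

/* One detector pass over a task that has been blocked for blocked_secs. */
static int model_check(struct hung_cfg *cfg, unsigned long blocked_secs,
		       unsigned long timeout_secs)
{
	int reports = 0;

	/* Interval warning: issued on every pass while the budget is non-zero. */
	if (cfg->interval_warnings) {
		consume_warning(&cfg->interval_warnings);
		reports |= REPORT_INTERVAL;
	}

	/* Timeout not reached yet: nothing more to do. */
	if (blocked_secs < timeout_secs)
		return reports;

	/*
	 * Timeout warning: skipped while interval warnings are still enabled.
	 * The budget is re-read here, after the decrement above.
	 */
	if (cfg->warnings && !cfg->interval_warnings) {
		consume_warning(&cfg->warnings);
		reports |= REPORT_TIMEOUT;
	}
	return reports;
}
```

In this model, with interval_warnings = -1 every pass reports an interval
warning and the timeout budget is never consumed, while with
interval_warnings = 0 the behaviour matches the pre-patch detector.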

Link: http://lkml.kernel.org/r/20190724170249.9644-1-dima@xxxxxxxxxx
Signed-off-by: Dmitry Safonov <dima@xxxxxxxxxx>
Cc: Dmitry Vyukov <dvyukov@xxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Jonathan Corbet <corbet@xxxxxxx>
Cc: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: "Peter Zijlstra (Intel)" <peterz@xxxxxxxxxxxxx>
Cc: Vasiliy Khoruzhick <vasilykh@xxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 Documentation/admin-guide/sysctl/kernel.rst |   20 ++++++-
 include/linux/sched/sysctl.h                |    1 
 kernel/hung_task.c                          |   50 ++++++++++++------
 kernel/sysctl.c                             |    8 ++
 4 files changed, 62 insertions(+), 17 deletions(-)

--- a/Documentation/admin-guide/sysctl/kernel.rst~hung_task-allow-printing-warnings-every-check-interval
+++ a/Documentation/admin-guide/sysctl/kernel.rst
@@ -45,6 +45,7 @@ show up in /proc/sys/kernel:
 - hung_task_timeout_secs
 - hung_task_check_interval_secs
 - hung_task_warnings
+- hung_task_interval_warnings
 - hyperv_record_panic_msg
 - kexec_load_disabled
 - kptr_restrict
@@ -383,14 +384,29 @@ Possible values to set are in range {0..
 hung_task_warnings:
 ===================
 
-The maximum number of warnings to report. During a check interval
-if a hung task is detected, this value is decreased by 1.
+The maximum number of warnings to report. If after timeout a hung
+task is present, this value is decreased by 1 every check interval,
+producing a warning.
 When this value reaches 0, no more warnings will be reported.
 This file shows up if CONFIG_DETECT_HUNG_TASK is enabled.
 
 -1: report an infinite number of warnings.
 
 
+hung_task_interval_warnings:
+============================
+
+Like hung_task_warnings, but sets the number of warnings to be issued
+about hung tasks detected at each check interval. These warnings are
+produced *before* the timeout expires.
+If a hung task is detected during a check interval, this value is
+decreased by 1. When this value reaches 0, only timeout warnings
+will be reported.
+This file shows up if CONFIG_DETECT_HUNG_TASK is enabled.
+
+-1: report an infinite number of check interval warnings.
+
+
 hyperv_record_panic_msg:
 ========================
 
--- a/include/linux/sched/sysctl.h~hung_task-allow-printing-warnings-every-check-interval
+++ a/include/linux/sched/sysctl.h
@@ -12,6 +12,7 @@ extern unsigned int  sysctl_hung_task_pa
 extern unsigned long sysctl_hung_task_timeout_secs;
 extern unsigned long sysctl_hung_task_check_interval_secs;
 extern int sysctl_hung_task_warnings;
+extern int sysctl_hung_task_interval_warnings;
 extern int proc_dohung_task_timeout_secs(struct ctl_table *table, int write,
 					 void __user *buffer,
 					 size_t *lenp, loff_t *ppos);
--- a/kernel/hung_task.c~hung_task-allow-printing-warnings-every-check-interval
+++ a/kernel/hung_task.c
@@ -49,6 +49,7 @@ unsigned long __read_mostly sysctl_hung_
 unsigned long __read_mostly sysctl_hung_task_check_interval_secs;
 
 int __read_mostly sysctl_hung_task_warnings = 10;
+int __read_mostly sysctl_hung_task_interval_warnings;
 
 static int __read_mostly did_panic;
 static bool hung_task_show_lock;
@@ -85,6 +86,34 @@ static struct notifier_block panic_block
 	.notifier_call = hung_task_panic,
 };
 
+static void hung_task_warning(struct task_struct *t, bool timeout)
+{
+	const char *loglevel = timeout ? KERN_ERR : KERN_INFO;
+	const char *path;
+	int *warnings;
+
+	if (timeout) {
+		warnings = &sysctl_hung_task_warnings;
+		path = "hung_task_timeout_secs";
+	} else {
+		warnings = &sysctl_hung_task_interval_warnings;
+		path = "hung_task_interval_warnings";
+	}
+
+	if (*warnings > 0)
+		--*warnings;
+
+	printk("%sINFO: task %s:%d blocked for more than %ld seconds.\n",
+	       loglevel, t->comm, t->pid, (jiffies - t->last_switch_time) / HZ);
+	printk("%s      %s %s %.*s\n",
+		loglevel, print_tainted(), init_utsname()->release,
+		(int)strcspn(init_utsname()->version, " "),
+		init_utsname()->version);
+	printk("%s\"echo 0 > /proc/sys/kernel/%s\" disables this message.\n",
+		loglevel, path);
+	sched_show_task(t);
+}
+
 static void check_hung_task(struct task_struct *t, unsigned long timeout)
 {
 	unsigned long switch_count = t->nvcsw + t->nivcsw;
@@ -109,6 +138,9 @@ static void check_hung_task(struct task_
 		t->last_switch_time = jiffies;
 		return;
 	}
+	if (sysctl_hung_task_interval_warnings)
+		hung_task_warning(t, false);
+
 	if (time_is_after_jiffies(t->last_switch_time + timeout * HZ))
 		return;
 
@@ -120,22 +152,10 @@ static void check_hung_task(struct task_
 		hung_task_call_panic = true;
 	}
 
-	/*
-	 * Ok, the task did not get scheduled for more than 2 minutes,
-	 * complain:
-	 */
 	if (sysctl_hung_task_warnings) {
-		if (sysctl_hung_task_warnings > 0)
-			sysctl_hung_task_warnings--;
-		pr_err("INFO: task %s:%d blocked for more than %ld seconds.\n",
-		       t->comm, t->pid, (jiffies - t->last_switch_time) / HZ);
-		pr_err("      %s %s %.*s\n",
-			print_tainted(), init_utsname()->release,
-			(int)strcspn(init_utsname()->version, " "),
-			init_utsname()->version);
-		pr_err("\"echo 0 > /proc/sys/kernel/hung_task_timeout_secs\""
-			" disables this message.\n");
-		sched_show_task(t);
+		/* Don't print warnings twice */
+		if (!sysctl_hung_task_interval_warnings)
+			hung_task_warning(t, true);
 		hung_task_show_lock = true;
 	}
 
--- a/kernel/sysctl.c~hung_task-allow-printing-warnings-every-check-interval
+++ a/kernel/sysctl.c
@@ -1147,6 +1147,14 @@ static struct ctl_table kern_table[] = {
 		.proc_handler	= proc_dointvec_minmax,
 		.extra1		= &neg_one,
 	},
+	{
+		.procname	= "hung_task_interval_warnings",
+		.data		= &sysctl_hung_task_interval_warnings,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= &neg_one,
+	},
 #endif
 #ifdef CONFIG_RT_MUTEXES
 	{
_

Patches currently in -mm which might be from dima@xxxxxxxxxx are

hung_task-allow-printing-warnings-every-check-interval.patch