Re: [PATCH v2] hung_task : check the value of "sysctl_hung_task_timeout_sec"

On 2014/3/26 0:25, Satoru Takeuchi wrote:
> At Tue, 25 Mar 2014 16:58:58 +0800,
> Liu hua wrote:
>>
>> On 2014/3/24 4:50, Satoru Takeuchi wrote:
>>> At Sun, 23 Mar 2014 15:54:04 +0800,
>>> Liu Hua wrote:
>>>>
>>>> As sysctl_hung_task_timeout_sec is unsigned long, when this value is
>>>> larger than LONG_MAX/HZ, the function schedule_timeout_interruptible in
>>>> watchdog will return immediately without sleeping and print:
>>>>
>>>> [  205.452934] schedule_timeout: wrong timeout value ffffffffffffff83
>>>>
>>>> and then the function watchdog will call schedule_timeout_interruptible again
>>>> and again. The screen will be filled with
>>>> 	"schedule_timeout: wrong timeout value ffffffffffffff83"
>>>>
>>>> This patch checks and corrects the value in timeout_jiffies, so that the
>>>> function schedule_timeout_interruptible always gets a valid parameter.
>>>>
>>>> Cc: <stable@xxxxxxxxxxxxxxx>
>>>> Signed-off-by: Liu Hua <sdu.liu@xxxxxxxxxx>
>>>> ---
>>>>  kernel/hung_task.c | 8 ++++++--
>>>>  1 file changed, 6 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
>>>> index 6df6149..f992286 100644
>>>> --- a/kernel/hung_task.c
>>>> +++ b/kernel/hung_task.c
>>>> @@ -174,8 +174,12 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
>>>>  
>>>>  static unsigned long timeout_jiffies(unsigned long timeout)
>>>>  {
>>>> -	/* timeout of 0 will disable the watchdog */
>>>> -	return timeout ? timeout * HZ : MAX_SCHEDULE_TIMEOUT;
>>>> +	/* timeout of 0 or >= LONG_MAX/HZ will disable the watchdog */
>>>> +	if ((timeout == 0) || (timeout > MAX_SCHEDULE_TIMEOUT))
>>>
>>> You should check whether sysctl_hung_task_timeout_sec > MAX_SCHEDULE_TIMEOUT/HZ
>>> when this parameter is set instead. Then this check isn't necessary here.
>>>
>>> # Just FYI, MAX_SCHEDULE_TIMEOUT should be MAX_SCHEDULE_TIMEOUT/HZ here. 
>>>
>>> Thanks,
>>> Satoru
>>
>>  Yes, how about this :
> 
> I confirmed the following.
> 
>  - 3.14-rc8: system hung with "hung_task_timeout_secs > LONG_MAX/HZ".
>  - 3.14-rc8 with your patch: works fine. I can't set the above mentioned value any more.
> 
> Documenting the possible values (0..LONG_MAX/HZ) in Documentation/sysctl/kernel.txt
> would make this patch better.
> 
> Thanks,
> Satoru

Thanks for your attention and suggestions. I have reworked the patch as follows.
Is it appropriate to repost it with the tag "PATCH v3"?

Subject: [PATCH v3] hung_task : check the value of "sysctl_hung_task_timeout_sec"

As sysctl_hung_task_timeout_sec is unsigned long, when this value is
larger than LONG_MAX/HZ, the function schedule_timeout_interruptible in
watchdog will return immediately without sleeping and print:

[  205.452934] schedule_timeout: wrong timeout value ffffffffffffff83

and then the function watchdog will call schedule_timeout_interruptible
again and again. The screen will be filled with
	"schedule_timeout: wrong timeout value ffffffffffffff83"

This patch adds a range check in sysctl, so that the function
schedule_timeout_interruptible always gets a valid parameter.

Signed-off-by: Liu Hua <sdu.liu@xxxxxxxxxx>
Tested-by: Satoru Takeuchi <satoru.takeuchi@xxxxxxxxx>
---
 Documentation/sysctl/kernel.txt | 1 +
 kernel/sysctl.c                 | 6 ++++++
 2 files changed, 7 insertions(+)

diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index e55124e..855d9b3 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -317,6 +317,7 @@ for more than this value report a warning.
 This file shows up if CONFIG_DETECT_HUNG_TASK is enabled.

 0: means infinite timeout - no checking done.
+Possible values to set are in range {0..LONG_MAX/HZ}.

 ==============================================================

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 49e13e1..aae21e8 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -144,6 +144,11 @@ static int min_percpu_pagelist_fract = 8;
 static int ngroups_max = NGROUPS_MAX;
 static const int cap_last_cap = CAP_LAST_CAP;

+/* This is needed for proc_doulongvec_minmax of sysctl_hung_task_timeout_secs */
+#ifdef CONFIG_DETECT_HUNG_TASK
+static unsigned long hung_task_timeout_max = (LONG_MAX/HZ);
+#endif
+
 #ifdef CONFIG_INOTIFY_USER
 #include <linux/inotify.h>
 #endif
@@ -995,6 +1000,7 @@ static struct ctl_table kern_table[] = {
 		.maxlen		= sizeof(unsigned long),
 		.mode		= 0644,
 		.proc_handler	= proc_dohung_task_timeout_secs,
+		.extra2		= &hung_task_timeout_max,
 	},
 	{
 		.procname	= "hung_task_warnings",
-- 
1.9.0

Thanks,
Liu Hua
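
P.S. For anyone following the thread, the wraparound described in the commit message can be sketched in user space. This is only an illustration under stated assumptions, not kernel code: SKETCH_HZ and the sketch_* helpers are hypothetical names, 64-bit fixed-width types stand in for unsigned long / long, and HZ = 125 with a timeout of ULONG_MAX is just one combination that happens to yield the same bit pattern as the log line above (the HZ and the value used on the reporting system are not stated in this thread).

```c
#include <stdint.h>

/* Assumption: illustrative tick rate, not taken from any kernel config. */
#define SKETCH_HZ 125u

/* Mirrors the arithmetic "timeout * HZ" on an unsigned 64-bit value. */
uint64_t sketch_timeout_jiffies(uint64_t timeout_secs)
{
	return timeout_secs * SKETCH_HZ; /* wraps modulo 2^64 on overflow */
}

/*
 * schedule_timeout() takes a signed long. Any bit pattern above
 * INT64_MAX reads as negative there, which is the case that triggers
 * the "wrong timeout value" message.
 */
int sketch_would_be_rejected(uint64_t jiffies)
{
	return jiffies > (uint64_t)INT64_MAX;
}

/* The range suggested in the review: 0..LONG_MAX/HZ. */
int sketch_timeout_is_valid(uint64_t timeout_secs)
{
	return timeout_secs <= (uint64_t)INT64_MAX / SKETCH_HZ;
}
```

With these assumptions, sketch_timeout_jiffies(UINT64_MAX) wraps to 0xffffffffffffff83, which sketch_would_be_rejected() flags, while any value accepted by sketch_timeout_is_valid() multiplies to a result that still fits in a signed 64-bit value.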
