Re: [PATCH] mm, vmstat: fix wrong WQ sleep when memory reclaim doesn't make any progress

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri 29-01-16 19:49:12, Tetsuo Handa wrote:
> Jan Stancek has reported that system occasionally hanging after
> "oom01" testcase from LTP triggers OOM. Guessing from a result that
> there is a kworker thread doing memory allocation and the values
> between "Node 0 Normal free:" and "Node 0 Normal:" differs when
> hanging, vmstat is not up-to-date for some reason.
> 
> According to commit 373ccbe59270 ("mm, vmstat: allow WQ concurrency to
> discover memory reclaim doesn't make any progress"), it meant to force
> the kworker thread to take a short sleep, but it by error used
> schedule_timeout(1). We missed that schedule_timeout() in state
> TASK_RUNNING doesn't do anything.

Dang... You are right of course. I've made the same mistake during
oom_reaper development but didn't realize that the same has been used
for the WQ thingy. My bad!

I am not sure this really fixes the issue mentioned above because I
didn't get to look at the report yet but we definitely have to change
the task state before calling schedule_timeout so this is obviously
correct. I would just argue that the interruptible sleep or TASK_IDLE
would be little bit better. But it shouldn't really matter much with
such a short timeout I guess.

> Fix it by using schedule_timeout_uninterruptible(1) which forces
> the kworker thread to take a short sleep in order to make sure
> that vmstat is up-to-date.
> 
> Fixes: 373ccbe59270 ("mm, vmstat: allow WQ concurrency to discover memory reclaim doesn't make any progress")
> Reported-by: Jan Stancek <jstancek@xxxxxxxxxx>
> Signed-off-by: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
> Cc: Michal Hocko <mhocko@xxxxxxxx>
> Cc: Tejun Heo <tj@xxxxxxxxxx>
> Cc: Cristopher Lameter <clameter@xxxxxxx>
> Cc: Joonsoo Kim <js1304@xxxxxxxxx>
> Cc: Arkadiusz Miskiewicz <arekm@xxxxxxxx>
> Cc: <stable@xxxxxxxxxxxxxxx>

Acked-by: Michal Hocko <mhocko@xxxxxxxx>

Thanks!
> ---
>  mm/backing-dev.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/backing-dev.c b/mm/backing-dev.c
> index 7340353..cbe6f0b 100644
> --- a/mm/backing-dev.c
> +++ b/mm/backing-dev.c
> @@ -989,7 +989,7 @@ long wait_iff_congested(struct zone *zone, int sync, long timeout)
>  		 * here rather than calling cond_resched().
>  		 */
>  		if (current->flags & PF_WQ_WORKER)
> -			schedule_timeout(1);
> +			schedule_timeout_uninterruptible(1);
>  		else
>  			cond_resched();
>  
> -- 
> 1.8.3.1
> 

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]