Re: [RFC][PATCH] mm, page_alloc: Warn on !__GFP_NOWARN allocation from IRQ context.

Johannes Weiner <hannes@xxxxxxxxxxx> · Tue, 2 Feb 2016 11:14:21 -0500

On Tue, Feb 02, 2016 at 10:33:22PM +0900, Tetsuo Handa wrote:
> From 20b3c1c9ef35547395c3774c6208a867cf0046d4 Mon Sep 17 00:00:00 2001
> From: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
> Date: Tue, 2 Feb 2016 16:50:45 +0900
> Subject: [RFC][PATCH] mm, page_alloc: Warn on !__GFP_NOWARN allocation from IRQ context.
> 
> Jan Stancek hit a hard lockup problem due to a flood of memory allocation
> failure messages which lasted for 10 seconds with IRQs disabled. Printing
> traces using warn_alloc_failed() is very slow (it can take up to about
> 1 second for each warn_alloc_failed() call). The caller used GFP_NOWAIT
> inside a loop. If the caller had used __GFP_NOWARN, it would not have
> lasted for 10 seconds.

Who is doing page allocations in a loop with irqs disabled?!

And then, why does it take that long? Is that a serial console? Most
of the output is KERN_INFO, so it might be better to raise the loglevel
and still have all the debugging output in the logs.

If that's not enough, we could consider changing the ratelimit or making
should_suppress_show_mem() filter interrupts regardless of NODES_SHIFT.
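
To make the second option concrete, here is a rough userspace model, not
kernel code. It assumes the current helper only suppresses the dump from
interrupt context when NODES_SHIFT > 8; the function names, the in_irq
parameter (standing in for in_interrupt()) and the NODES_SHIFT value are
made up for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel config value. */
#ifndef NODES_SHIFT
#define NODES_SHIFT 6
#endif

/*
 * Model of the behaviour assumed above: the dump is only suppressed
 * in interrupt context on very large NUMA configurations.
 */
static bool suppress_show_mem_current(bool in_irq)
{
	bool ret = false;

#if NODES_SHIFT > 8
	ret = in_irq;
#endif
	(void)in_irq;	/* avoid an unused-parameter warning on small configs */
	return ret;
}

/*
 * Model of the suggestion: suppress the dump whenever we are in
 * interrupt context, whatever NODES_SHIFT is.
 */
static bool suppress_show_mem_proposed(bool in_irq)
{
	return in_irq;
}
```

With the illustrative NODES_SHIFT of 6, the first variant never suppresses
anything, while the second skips the expensive dump for every caller in
interrupt context and leaves process context alone.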

Or ratelimit show_mem() in a different way than the single page alloc
failure line. It's not as if the state changes significantly while an
avalanche of allocations is failing.
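
A minimal sketch of what separate ratelimits could look like, again as a
userspace model rather than a patch. The struct, the helpers and the
interval/burst values are hypothetical; this is not the kernel's
struct ratelimit_state or __ratelimit(), only the same interval/burst idea.

```c
#include <assert.h>
#include <stdbool.h>

/* Simple interval/burst limiter (a model, not the kernel implementation). */
struct rl_state {
	long interval;	/* window length, in ticks */
	int burst;	/* events allowed per window */
	long begin;	/* start of the current window */
	int printed;	/* events allowed so far in this window */
	bool started;	/* false until the first event opens a window */
};

/* Return true if an event at time `now` is allowed through. */
static bool rl_allow(struct rl_state *rs, long now)
{
	assert(rs != 0);

	/* Open a fresh window on first use or once the old one expired. */
	if (!rs->started || now - rs->begin >= rs->interval) {
		rs->begin = now;
		rs->printed = 0;
		rs->started = true;
	}
	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}
	return false;
}

struct warn_counts {
	int lines;	/* cheap one-line failure messages emitted */
	int dumps;	/* expensive show_mem()-style dumps emitted */
};

/* Model one allocation failure reported at time `now`. */
static void model_alloc_failure(struct rl_state *line_rs,
				struct rl_state *dump_rs,
				long now, struct warn_counts *out)
{
	if (!rl_allow(line_rs, now))
		return;
	out->lines++;

	/* The dump gets its own, stricter limit. */
	if (rl_allow(dump_rs, now))
		out->dumps++;
}
```

With a burst of 10 for the failure line and a burst of 1 for the dump over
the same window, a storm of failures still logs ten failure lines per window
but pays for the memory dump only once.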
