Re: [PATCH] mm,page_alloc: softlockup on warn_alloc on

Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx> · Sun, 24 Sep 2017 10:56:35 +0900

Tetsuo Handa wrote:
> Assuming that Wang Yu's trace has
> 
>   RIP: 0010:[<...>]  [<...>] dump_stack+0x.../0x...
> 
> line in the omitted part (like Cong Wang's trace did), I suspect that a thread
> which is holding dump_lock is unable to leave console_unlock() from printk() for
> so long because many other threads are trying to call printk() from warn_alloc()
> while consuming all CPU time.
> 
> Thus, not allowing other threads to consume CPU time / call printk() is a step for
> isolating it. If this problem still exists even if we made other threads sleep,
> the real cause will be somewhere else. But unfortunately Cong Wang has not yet
> succeeded with reproducing the problem. If Wang Yu is able to reproduce the problem,
> we can try setting 1 to /proc/sys/kernel/softlockup_all_cpu_backtrace so that
> we can know what other CPUs are doing.

It seems that Johannes needs more time to get a test result from the production
environment. Meanwhile, as a reference point, Wang, could you retry your stress
test with /proc/sys/kernel/softlockup_all_cpu_backtrace set to 1?
I don't have access to environments with many CPUs...
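
For reference, a minimal sketch of how the knob could be enabled before
re-running the stress test. The function name is only illustrative (it is not
an existing tool), and it assumes a kernel built with the soft lockup detector
and a root shell. The setting does not persist across reboots.

```shell
#!/bin/sh
# Hypothetical helper: write 1 to the given sysctl file and verify it.
# The path is passed as an argument so the helper can be tried on a
# scratch file first.
enable_all_cpu_backtrace() {
    knob="$1"
    # Writing to the real /proc file requires root.
    echo 1 > "$knob" || return 1
    # Read the value back to confirm that it took effect.
    [ "$(cat "$knob")" = "1" ]
}

# Intended use (as root), before starting the stress test:
#   enable_all_cpu_backtrace /proc/sys/kernel/softlockup_all_cpu_backtrace
```

With the knob set, a soft lockup report should be followed by backtraces from
the other CPUs in the kernel log, which is the information we are missing.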
