Tetsuo Handa wrote:
> Assuming that Wang Yu's trace has
>
>   RIP: 0010:[<...>] [<...>] dump_stack+0x.../0x...
>
> line in the omitted part (like Cong Wang's trace did), I suspect that a thread
> which is holding dump_lock is unable to leave console_unlock() from printk() for
> so long because many other threads are trying to call printk() from warn_alloc()
> while consuming all CPU time.
>
> Thus, not allowing other threads to consume CPU time / call printk() is a step for
> isolating it. If this problem still exists even if we made other threads sleep,
> the real cause will be somewhere else. But unfortunately Cong Wang has not yet
> succeeded with reproducing the problem. If Wang Yu is able to reproduce the problem,
> we can try setting 1 to /proc/sys/kernel/softlockup_all_cpu_backtrace so that
> we can know what other CPUs are doing.

It seems that Johannes needs more time to get a test result from a production
environment. Meanwhile, for use as a reference, Wang, do you have a chance to
retry your stress test with /proc/sys/kernel/softlockup_all_cpu_backtrace set to 1?

I don't have access to environments with many CPUs...
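
For reference, a minimal sketch of setting softlockup_all_cpu_backtrace to 1 before a stress test. The function name enable_all_cpu_backtrace and the SYSCTL_FILE override are hypothetical, added only so the snippet can be tried against a scratch file; on a real system it needs root and a kernel built with the soft lockup detector.

```shell
#!/bin/sh
# Sketch only: SYSCTL_FILE and enable_all_cpu_backtrace are illustrative names.
# SYSCTL_FILE can be overridden to dry-run against an ordinary file.
SYSCTL_FILE="${SYSCTL_FILE:-/proc/sys/kernel/softlockup_all_cpu_backtrace}"

# Write 1 to the knob and print the previous value so it can be restored later.
enable_all_cpu_backtrace() {
    old=$(cat "$SYSCTL_FILE") || return 1
    echo 1 > "$SYSCTL_FILE" || return 1
    echo "$old"
}
```

Calling enable_all_cpu_backtrace as root before starting the test, and writing the printed value back afterwards, should be enough. The same knob is also reachable as "sysctl -w kernel.softlockup_all_cpu_backtrace=1".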