When a kernel oops happens in a kernel thread (e.g. kcompactd in this
test), the oops handler may trigger the following bug:

BUG: sleeping function called from invalid context at include/linux/sched.h:2858
in_atomic(): 0, irqs_disabled(): 1, pid: 110, name: kcompactd0
CPU: 6 PID: 110 Comm: kcompactd0 Tainted: G D 4.6.0-rc4-next-20160420 #4
Hardware name: Intel Corporation S5520HC/S5520HC, BIOS S5500.86B.01.10.0025.030220091519 03/02/2009
 0000000000000000 ffff88036173f9e8 ffffffff8152666f 0000000000000000
 ffff880361732680 ffff88036173fa08 ffffffff81088b13 ffffffff81ee3372
 0000000000000b2a ffff88036173fa30 ffffffff81088bd9 ffff880361732680
Call Trace:
 [<ffffffff8152666f>] dump_stack+0x67/0x98
 [<ffffffff81088b13>] ___might_sleep+0x123/0x1a0
 [<ffffffff81088bd9>] __might_sleep+0x49/0x80
 [<ffffffff810706b4>] exit_signals+0x24/0x130
 [<ffffffff81063cc4>] do_exit+0xc4/0xca0
 [<ffffffff810201d9>] oops_end+0x89/0xc0
 [<ffffffff810518c4>] no_context+0x144/0x390
 [<ffffffff81542f17>] ? debug_smp_processor_id+0x17/0x20
 [<ffffffff81051c1d>] __bad_area_nosemaphore+0x10d/0x230
 [<ffffffff811769e9>] ? free_hot_cold_page_list+0x49/0xd0
 [<ffffffff81051d54>] bad_area_nosemaphore+0x14/0x20
 [<ffffffff81051f97>] __do_page_fault+0x237/0x570
 [<ffffffff810522f9>] do_page_fault+0x29/0x80
 [<ffffffff81be7b22>] page_fault+0x22/0x30
 [<ffffffff8119d2f8>] ? release_freepages+0x18/0xa0
 [<ffffffff8119f13d>] compact_zone+0x55d/0x9f0
 [<ffffffff81196239>] ? fragmentation_index+0x19/0x70
 [<ffffffff8119f92f>] kcompactd_do_work+0x10f/0x230
 [<ffffffff8119fae0>] kcompactd+0x90/0x1e0
 [<ffffffff810a3a40>] ? wait_woken+0xa0/0xa0
 [<ffffffff8119fa50>] ? kcompactd_do_work+0x230/0x230
 [<ffffffff810801ed>] kthread+0xdd/0x100
 [<ffffffff81be5ee2>] ret_from_fork+0x22/0x40
 [<ffffffff81080110>] ? kthread_create_on_node+0x180/0x180

Because this code path can be entered with interrupts disabled, the
might_sleep() in threadgroup_change_begin() may be triggered.
Before calling exit_signals(), do_exit() has already checked that it is
not running in a hard IRQ handler, so it should be safe to re-enable
interrupts at that point.

Signed-off-by: Yang Shi <yang.shi@xxxxxxxxxx>
---
 kernel/exit.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/kernel/exit.c b/kernel/exit.c
index 9e6e135..c6f8e37 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -679,6 +679,14 @@ void do_exit(long code)
 	validate_creds_for_do_exit(tsk);
 
 	/*
+	 * It is possible to get here with interrupt disabled when fault
+	 * happens in kernel thread. Enable interrupt to make threadgroup
+	 * happy.
+	 */
+	if (irqs_disabled())
+		local_irq_enable();
+
+	/*
 	 * We're taking recursive faults here in do_exit. Safest is to just
 	 * leave this task alone and wait for reboot.
 	 */
-- 
2.0.2