If a faulty machine raises an MCE irq event very early in the boot, it
can try to wake the -rt specific handler thread (mce_notify_helper)
before it exists. (The thread is created through a device_initcall that
happens later in the boot.) When this happens, the irq handler calls
wake_up_process() with a NULL pointer, which panics the machine at boot.

The race between the irq event and thread init is as follows:

   mce_notify_irq();
   --> mce_notify_work();
     --> wake_up_process(mce_notify_helper);

   device_initcall_sync(mcheck_init_device);
   --> mce_notify_work_init();
     --> mce_notify_helper = kthread_run(mce_notify_helper_thread, ...);

So, if the IRQ event happens before the device_initcall, the
mce_notify_helper pointer (at global file scope and hence in BSS) will
still be NULL, resulting in the following panic at boot:

 CPU: Physical Processor ID: 0
 CPU: Processor Core ID: 0
 ENERGY_PERF_BIAS: Set to 'normal', was 'performance'
 ENERGY_PERF_BIAS: View and update with x86_energy_perf_policy(8)
 mce: CPU supports 22 MCE banks
 CPU0: Thermal monitoring enabled (TM1)
 Last level iTLB entries: 4KB 0, 2MB 0, 4MB 0
 Last level dTLB entries: 4KB 64, 2MB 0, 4MB 0
 tlb_flushall_shift: 6
 Freeing SMP alternatives: 36k freed
 ACPI: Core revision 20130328
 BUG: unable to handle kernel NULL pointer dereference at (null)
 IP: [<ffffffff8107730d>] wake_up_process+0xd/0x40
 PGD 0
 Oops: 0000 [#1] PREEMPT SMP
 Modules linked in:
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.10.40-rt40_preempt-rt #1
 Hardware name: Insyde Grantley/Type2 - Board Product Name1, BIOS 05.04.07 04/21/2014
 task: ffffffff81e14440 ti: ffffffff81e00000 task.ti: ffffffff81e00000
 RIP: 0010:[<ffffffff8107730d>] [<ffffffff8107730d>] wake_up_process+0xd/0x40
 RSP: 0000:ffff88107fc03f68 EFLAGS: 00010086
 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000007ffefbff
 RDX: 00000000ffffffff RSI: 0000000000000000 RDI: 0000000000000000
 RBP: ffff88107fc03f70 R08: 0000000000000002 R09: 0000000000000003
 R10: 0000000000000000 R11: 0000000000000001 R12: ffff88103f03d100
 R13: ffff880ff4e0c000 R14: ffff88107fc16f00 R15: ffff880ff4e0c000
 FS: 0000000000000000(0000) GS:ffff88107fc00000(0000) knlGS:0000000000000000
 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000000 CR3: 0000000001e0f000 CR4: 00000000001406f0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 00000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Stack:
  ffff88107fc0ccf0 ffff88107fc03f80 ffffffff8101f900 ffff88107fc03f98
  ffffffff8102169d ffff88107fc0fab0 ffff88107fc03fa8 ffffffff81022051
  ffffffff81e01d48 ffffffff819a8a9a ffffffff81e01bf8 <EOI> ffffffff81e01d48
 Call Trace:
  <IRQ>
  [<ffffffff8101f900>] mce_notify_irq+0x30/0x40
  [<ffffffff8102169d>] intel_threshold_interrupt+0xbd/0xe0
  [<ffffffff81022051>] smp_threshold_interrupt+0x21/0x40
  [<ffffffff819a8a9a>] threshold_interrupt+0x6a/0x70
  <EOI>
  [<ffffffff8199c57c>] ? __slab_alloc.isra.48+0x39e/0x60c
  [<ffffffff814369d5>] ? acpi_ps_alloc_op+0x9a/0xa1
  [<ffffffff811534a8>] ? kmem_cache_free+0xb8/0x2b0
  [<ffffffff81152be4>] kmem_cache_alloc+0x234/0x2e0
  [<ffffffff814369d5>] ? acpi_ps_alloc_op+0x9a/0xa1
  [<ffffffff814369d5>] acpi_ps_alloc_op+0x9a/0xa1
  [<ffffffff8143523f>] acpi_ps_get_next_arg+0xfe/0x3d3
  [<ffffffff814357a4>] acpi_ps_parse_loop+0x290/0x560
  [<ffffffff814364bc>] acpi_ps_parse_aml+0x98/0x28c
  [<ffffffff8143242c>] acpi_ns_one_complete_parse+0x104/0x124
  [<ffffffff8143247f>] acpi_ns_parse_table+0x33/0x38
  [<ffffffff81431e56>] acpi_ns_load_table+0x4a/0x8c
  [<ffffffff81439d6e>] acpi_load_tables+0xa2/0x176
  [<ffffffff81f4dbf3>] acpi_early_init+0x70/0x100
  [<ffffffff81f1c4e9>] ? check_bugs+0xe/0x2d
  [<ffffffff81f14df2>] start_kernel+0x387/0x3b5
  [<ffffffff81f14874>] ? repair_env_string+0x5c/0x5c
  [<ffffffff81f145ad>] x86_64_start_reservations+0x2a/0x2c
  [<ffffffff81f1467b>] x86_64_start_kernel+0xcc/0xcf
 Code: 8b 52 18 e9 9e fc ff ff 48 89 45 c0 e8 cd df 92 00 48 8b 45 c0 eb e5 0f 1f 80 00 00 00 00 e8 fb 04 93 00 55 48 89 e5 53 48 89 fb <48> 8b 07 a8 0c 75 12 48 89 df 31 d2 be 03 00 00 00 e8 ad fb ff
 RIP [<ffffffff8107730d>] wake_up_process+0xd/0x40
  RSP <ffff88107fc03f68>
 CR2: 0000000000000000
 ---[ end trace 0000000000000001 ]---
 Kernel panic - not syncing: Fatal exception in interrupt

Evidently the hardware has issues, but we can handle this more
gracefully by ignoring the events that happen before the
device_initcall has registered the mce handler thread.

Signed-off-by: Paul Gortmaker <paul.gortmaker@xxxxxxxxxxxxx>

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index aaf4b9b94f38..94860c521fb8 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1391,6 +1391,11 @@ static int mce_notify_work_init(void)
 
 static void mce_notify_work(void)
 {
+	if (unlikely(!mce_notify_helper)) {
+		pr_info(HW_ERR "Machine check event before MCE init; ignored\n");
+		return;
+	}
+
 	wake_up_process(mce_notify_helper);
 }
 #else
-- 
2.0.1
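
For anyone who wants to poke at the ordering problem outside the kernel,
here is a minimal userspace sketch of the same guard. It is not kernel
code and not part of the patch. All names in it are hypothetical
stand-ins: struct fake_task for task_struct, notify_helper for
mce_notify_helper, fake_wake_up_process() for wake_up_process(),
notify_work() for mce_notify_work(), and notify_work_init() for
mce_notify_work_init(). Unlike the patch, notify_work() returns a bool so
a caller can tell whether the event was dropped, and the pr_info() is
left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct task_struct. */
struct fake_task {
	int wakeups;	/* number of times this "thread" was woken */
};

/*
 * Stand-in for the file-scope mce_notify_helper pointer. It has static
 * storage duration, so it is NULL until the init function has run.
 */
static struct fake_task *notify_helper;
static struct fake_task helper_storage;

/*
 * Stand-in for wake_up_process(). It uses its argument unconditionally;
 * the assert models the NULL pointer dereference from the oops.
 */
static void fake_wake_up_process(struct fake_task *t)
{
	assert(t != NULL);
	t->wakeups++;
}

/*
 * Mirrors the patched mce_notify_work(): drop the event if the helper
 * has not been created yet. Returns false when the event was dropped.
 */
static bool notify_work(void)
{
	if (!notify_helper)
		return false;

	fake_wake_up_process(notify_helper);
	return true;
}

/* Stand-in for mce_notify_work_init(), which runs later in the boot. */
static void notify_work_init(void)
{
	notify_helper = &helper_storage;
}
```

Calling notify_work() before notify_work_init() returns false and leaves
helper_storage.wakeups at zero. After notify_work_init() it returns true
and the count goes to one. With the NULL check removed, the early call
would trip the assert, which stands in for the panic above.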