On 03/08/2012 01:41 PM, Eric W. Biederman wrote:
Fernando Luis Vázquez Cao<fernando@xxxxxxxxxxxxx> writes:Subject: [PATCH] boot: ignore early NMIs From: Fernando Luis Vazquez Cao<fernando@xxxxxxxxxxxxx> NMIs very early in the boot process are rarely critical (usually it just means that there was a spurious bit flip somewhere in the hardware, or that this is a kdump kernel and we received an NMI generated in the previous context), so the current behavior of halting the system when one occurs is probably a bit over the top. This patch changes the early IDT so that NMIs are ignored and the kernel can, hopefully, continue executing other code. Harsher measures (panic, etc) are defered to the final NMI handler, which can actually make an informed decision. This issue presented itself in our environment as seemingly random hangs in kdump. Signed-off-by: Fernando Luis Vazquez Cao<fernando@xxxxxxxxxxxxx> --- diff -urNp linux-3.3-rc6-orig/arch/x86/kernel/head64.c linux-3.3-rc6/arch/x86/kernel/head64.c --- linux-3.3-rc6-orig/arch/x86/kernel/head64.c 2012-03-07 15:49:01.834241787 +0900 +++ linux-3.3-rc6/arch/x86/kernel/head64.c 2012-03-07 18:39:03.173732875 +0900 @@ -71,7 +71,7 @@ void __init x86_64_start_kernel(char * r (__START_KERNEL& PGDIR_MASK))); BUILD_BUG_ON(__fix_to_virt(__end_of_fixed_addresses)<= MODULES_END); - /* clear bss before set_intr_gate with early_idt_handler */ + /* clear bss before set_intr_gate with early_idt_handlers */ clear_bss(); /* Make NULL pointers segfault */ @@ -79,13 +79,8 @@ void __init x86_64_start_kernel(char * r max_pfn_mapped = KERNEL_IMAGE_SIZE>> PAGE_SHIFT; - for (i = 0; i< NUM_EXCEPTION_VECTORS; i++) { -#ifdef CONFIG_EARLY_PRINTK + for (i = 0; i< NUM_EXCEPTION_VECTORS; i++) set_intr_gate(i,&early_idt_handlers[i]); -#else - set_intr_gate(i, early_idt_handler); -#endif - } load_idt((const struct desc_ptr *)&idt_descr); if (console_loglevel == 10) diff -urNp linux-3.3-rc6-orig/arch/x86/kernel/head_64.S linux-3.3-rc6/arch/x86/kernel/head_64.S --- linux-3.3-rc6-orig/arch/x86/kernel/head_64.S 2012-03-07 15:49:01.838241839 +0900 +++ linux-3.3-rc6/arch/x86/kernel/head_64.S 2012-03-07 18:41:21.811516876 +0900 @@ -270,18 +270,29 @@ bad_address: jmp bad_address .section ".init.text","ax" -#ifdef CONFIG_EARLY_PRINTK .globl early_idt_handlers early_idt_handlers: - i = 0 + vector = 0 .rept NUM_EXCEPTION_VECTORS - movl $i, %esi - jmp early_idt_handler - i = i + 1 + /* + * NMIs (vector 2) this early in the boot process are rarely critical + * (usually it just means that there was a spurious bit flip somewhere + * in the hardware, or that this is a kdump kernel and we received an + * NMI generated in the previous context), so we ignore them here and + * try to continue (see early_nmi_handler implementation below). + * Harsher measures (panic, etc) are defered to the final NMI handler, + * which can actually make an informed decision. + */ + .if vector == 2 + jmp early_nmi_handlerIs just a jump and not a move followed by a jump still 10 bytes? I hate to say it but I think this fails miserably for any exception after a nmi.
Thank you for the heads up! Actually, it was working for the exceptions after the nmi but with a corrupted esi (vector number). My original intention was to fill the empty space with nops but forgot to actually implement it... Sorry about that. Will fix for the next iteration.
I expect the simplest solution is to modify early_idt_handler to test for vector == 2.
That is precisely what I did on a previous version but that would involve using registers which need to be saved and restored and I wanted to avoid using the stack in the NMI path. We would also need to add a "pushq rsi " in early_idt_handlers which implies modifying "early_idt_handlers" definition in "segment.h". If you are OK with it I would like to go with the approach in the two patches I sent.
Doing something less brittle than:extern const char early_idt_handlers[NUM_EXCEPTION_VECTORS][10];in segment.h might be a good idea as well.
Yes, I agree. I will give it some thought. -- To unsubscribe from this list: send the line "unsubscribe linux-tip-commits" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html