Hi Matt,

On Tue, 20 Jan 2015 19:02:38 +0000 Matt Fleming wrote:
> On Fri, 16 Jan, at 11:03:44AM, Bruno Prémont wrote:
> > Register dump:
> > rax            0x1000       4096
> > rbx            0x23f78cb    37714123
> > rcx            0x0          0
> > rdx            0x0          0
> > rsi            0x0          0
> > rdi            0x23f7863    37714019
> > rbp            0x1a363b4    0x1a363b4
> > rsp            0x2404b20    0x2404b20
> > r8             0x2404ee0    37768928
> > r9             0x4          4
> > r10            0x3          3
> > r11            0x9          9
> > r12            0x13dcbbc    20827068
> > r13            0x1e000000   503316480 (this seems to point to decompressed kernel)
>
> [...]
>
> > while on the failing one I get (just enough efi_printk to cause kernel to boot):
> > [ 0.000000] efi: EFI v2.30 by VMware, Inc.
> > [ 0.000000] efi: SMBIOS=0x1ffaf000 ACPI 2.0=0x1ff9f000
> > [ 0.000000] efi: mem00: [ACPI Memory NVS | | | | | |WB|WT|WC|UC] range=[0x0000000000000000-0x0000000000001000) (0MB)
>
> [..]
>
> > [ 0.000000] efi: mem23: [Boot Data | | | | | |WB|WT|WC|UC] range=[0x000000001dee8000-0x000000001e547000) (6MB)
>
> Oops. It sure looks like the EFI boot stub is trashing an EFI boot data
> region. That would certainly explain the memory corruption you're seeing
> (since the firmware assumes no one else is touch its data areas).

Interestingly that part of the memory map has not changed though.

Though I'm wondering why bzImage is (also) being corrupted in mem04.
I've not checked yet how far decompression got (nor if it matches the
start of corruption in bzImage).

> By any chance have you modified CONFIG_PHYSICAL_START in your .config?

I've not touched it, so it has default value:
CONFIG_PHYSICAL_START=0x1000000

> The suspect code is probably this from
> arch/x86/boot/compressed/head_64.S:
>
> ---
>
> /*
>  * Compute the decompressed kernel start address. It is where
>  * we were loaded at aligned to a 2M boundary. %rbp contains the
>  * decompressed kernel start address.
>  *
>  * If it is a relocatable kernel then decompress and run the kernel
>  * from load address aligned to 2MB addr, otherwise decompress and
>  * run the kernel from LOAD_PHYSICAL_ADDR
>  *
>  * We cannot rely on the calculation done in 32-bit mode, since we
>  * may have been invoked via the 64-bit entry point.
>  */
>
> 	/* Start with the delta to where the kernel will run at. */
> #ifdef CONFIG_RELOCATABLE
> 	leaq	startup_32(%rip) /* - $startup_32 */, %rbp
> 	movl	BP_kernel_alignment(%rsi), %eax
> 	decl	%eax
> 	addq	%rax, %rbp
> 	notq	%rax
> 	andq	%rax, %rbp
> 	cmpq	$LOAD_PHYSICAL_ADDR, %rbp
> 	jge	1f
> #endif
> 	movq	$LOAD_PHYSICAL_ADDR, %rbp
> 1:
>
> You may want to snoop around this code to make sure that we're not
> making some crazy calculation mistakes wrt where we decompress the
> kernel.

What's the best way to check this? I could add an endless loop just
before that block and replay in gdb with a coredump.

Thanks,
Bruno
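
For reference, the arithmetic in the quoted head_64.S block can be modelled outside the kernel to see where a given load address would end up. The following is only a rough sketch, not kernel code: the load address 0x1df00000 is a hypothetical value picked for illustration (the real one is not given in this thread), the 2MB alignment is taken from the quoted comment rather than read from BP_kernel_alignment, and LOAD_PHYSICAL_ADDR is assumed to equal CONFIG_PHYSICAL_START (0x1000000). The helper names are made up for this example.

```python
MASK64 = (1 << 64) - 1

# Boot Data region "mem23" from the quoted memory map, as [start, end)
MEM23 = (0x1DEE8000, 0x1E547000)


def decompress_target(load_addr, kernel_alignment, load_physical_addr):
    """Model of the quoted CONFIG_RELOCATABLE block (hypothetical helper)."""
    rax = (kernel_alignment - 1) & 0xFFFFFFFF  # movl + decl: 32-bit, zero-extended
    rbp = (load_addr + rax) & MASK64           # addq %rax, %rbp
    rax = ~rax & MASK64                        # notq %rax
    rbp &= rax                                 # andq %rax, %rbp
    # cmpq/jge is a signed compare; for addresses below 2**63 a plain
    # comparison gives the same result.
    if rbp >= load_physical_addr:
        return rbp
    return load_physical_addr                  # movq $LOAD_PHYSICAL_ADDR, %rbp


def in_region(addr, region):
    """True if addr falls inside the half-open range [start, end)."""
    start, end = region
    return start <= addr < end


if __name__ == "__main__":
    # Hypothetical load address, assumed 2MB alignment, assumed LOAD_PHYSICAL_ADDR
    target = decompress_target(0x1DF00000, 0x200000, 0x1000000)
    print(hex(target), in_region(target, MEM23))  # prints: 0x1e000000 True
```

Under these assumptions, any load address above 0x1de00000 and up to 0x1e000000 rounds up to 0x1e000000, which matches the r13 value in the register dump and lies inside the mem23 Boot Data range. Whether that is what actually happened here would still need to be confirmed with the real load address.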