On 04/17/19 at 02:00pm, Kairui Song wrote: > On Wed, Apr 17, 2019 at 12:57 PM Dave Young <dyoung@xxxxxxxxxx> wrote: > > > > On 04/17/19 at 09:38am, Dave Young wrote: > > > On 04/16/19 at 03:22pm, Borislav Petkov wrote: > > > > On Tue, Apr 16, 2019 at 07:41:33PM +0800, Dave Young wrote: > > > > > On 04/16/19 at 11:52am, Borislav Petkov wrote: > > > > > > I'll queue the below in the next days if there are no more complaints: > > > > > > > > > > As for the kexec breakage, even with the V3 patch, kexec still hangs on > > > > > a Lenovo T420 laptop. Kairui also reproduced the problem. So can we > > > > > wait a few days see if we can make some progress to find the cause? > > > > > > > > How is applying this patch going to change anything? > > > > > > > > I was told that the breakage is there even without it... > > > > > > Without this patch, the bug happens in the efi_get_rsdp.. function, this > > > patch tries to fix that by adding kexec_get.. but the new introduced > > > kexec_* function does not work on some laptops, so it is not a 100% good > > > fix, I hoped we can get it working for all known issues. But if we can > > > not do it eg. within one week we can go with this version and leave the > > > laptop issue as a known issue. > > > > > > > Latest debugging status: > > > > Kexec boot works with commenting out some code like below, so the guid > > cmp (memcmp) caused a system reset), still need to find out why: > > > > diff --git a/arch/x86/boot/compressed/acpi.c b/arch/x86/boot/compressed/acpi.c > > index d9f9abd63c68..13e7a23ae94c 100644 > > --- a/arch/x86/boot/compressed/acpi.c > > +++ b/arch/x86/boot/compressed/acpi.c > > @@ -95,10 +95,12 @@ __efi_get_rsdp_addr(unsigned long config_tables, unsigned int nr_tables, > > table = tbl->table; > > } > > > > +/* > > if (!(efi_guidcmp(guid, ACPI_TABLE_GUID))) > > rsdp_addr = table; > > else if (!(efi_guidcmp(guid, ACPI_20_TABLE_GUID))) > > return table; > > +*/ > > } > > > > return rsdp_addr; > > @@ -291,9 +293,10 @@ acpi_physical_address get_rsdp_addr(void) > > if (!pa) > > pa = kexec_get_rsdp_addr(); > > > > +/* > > if (!pa) > > pa = efi_get_rsdp_addr(); > > - > > +*/ > > if (!pa) > > pa = bios_get_rsdp_addr(); > > > > > > Hi Dave, for this case I think it's just because GCC will found the > loop does nothing, and optimize out the whole loop in > __efi_get_rsdp_addr and will no longer read the actual nr_table value. > > I can fix the boot error on T420 with your patch, but if I add > anything, like a hardcode value assignment with the right value for > acpi_rsdp in the loop, it will reset the machine. But set acpi_rsdp > with a right initial value out side the loop works fine. > If the loop condition is false, then there should be no difference > between just comment out the line you mentioned and add an assignment. > Else it just assign the value multiple times, not very reasonable but > shouldn't fail. > > And, I inspected the generated ASM code also suggest the same thing. > So still, access the systab memory is the cause of the system reset on > certain machines. Makse sense, my previous debug also point to some systab accessing. Probably some early pg table mess up. Thanks Dave _______________________________________________ kexec mailing list kexec@xxxxxxxxxxxxxxxxxxx http://lists.infradead.org/mailman/listinfo/kexec