On Fri, 2021-12-17 at 14:55 -0600, Tom Lendacky wrote:
> On 12/17/21 2:13 PM, David Woodhouse wrote:
> > On Fri, 2021-12-17 at 13:46 -0600, Tom Lendacky wrote:
> > > There's no WARN or PANIC, just a reset. I can look to try and capture some
> > > KVM trace data if that would help. If so, let me know what events you'd
> > > like captured.
> >
> > Could start with just kvm_run_exit?
> >
> > Reason 8 would be KVM_EXIT_SHUTDOWN and would potentially indicate a
> > triple fault.
>
> qemu-system-x86-24093 [005] ..... 1601.759486: kvm_exit: vcpu 112 reason shutdown rip 0xffffffff81070574 info1 0x0000000000000000 info2 0x0000000000000000 intr_info 0x80000b08 error_code 0x00000000
>
> # addr2line -e woodhouse-build-x86_64/vmlinux 0xffffffff81070574
> /root/kernels/woodhouse-build-x86_64/./arch/x86/include/asm/desc.h:272
>
> Which is: asm volatile("ltr %w0"::"q" (GDT_ENTRY_TSS*8));

So, I remain utterly bemused by this, and the Milan *guests* I have
access to can't even kexec with a stock kernel; that is also "too fast"
and they take a triple fault during the bringup in much the same way,
even without my parallel patches, and even going back to fairly old
kernels.

I wasn't able to follow up with raw serial output during the bringup to
pinpoint precisely where it happens, because the VM would tear itself
down in response to the triple fault without actually flushing the last
virtual serial output :)

It would be really useful to get access to a suitable host where I can
spawn this in qemu and watch it fail. I am suspecting a chip-specific
quirk or bug at this point.

I might suggest in the short term that we could unblock the parallel
bringup work by just not doing it for affected chips... but that won't
make existing kexec work.
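
For anyone reading the kvm_exit line quoted above, here is a rough sketch of how the intr_info value can be split into fields. It assumes the value follows the AMD SVM EXITINTINFO layout (vector in bits 7:0, type in bits 10:8, error-code-valid in bit 11, valid in bit 31); the helper name and the lookup tables are made up for illustration and are not part of any kernel or KVM tooling.

```python
# Sketch only: decode an intr_info value from a kvm_exit trace line.
# Assumption: the field uses the AMD SVM EXITINTINFO bit layout.

# Partial lookup tables, for illustration only.
TYPE_NAMES = {
    0: "external interrupt",
    2: "NMI",
    3: "exception",
    4: "software interrupt",
}
EXCEPTION_NAMES = {8: "#DF", 13: "#GP", 14: "#PF"}


def decode_intr_info(value):
    """Split a 32-bit intr_info value into its fields (hypothetical helper)."""
    return {
        "valid": bool((value >> 31) & 1),             # bit 31
        "vector": value & 0xFF,                       # bits 7:0
        "type": (value >> 8) & 0x7,                   # bits 10:8
        "error_code_valid": bool((value >> 11) & 1),  # bit 11
    }


if __name__ == "__main__":
    info = decode_intr_info(0x80000B08)  # value taken from the trace above
    print(info)
    print(TYPE_NAMES.get(info["type"]), EXCEPTION_NAMES.get(info["vector"]))
```

If that layout assumption holds, 0x80000b08 reads as a valid exception with vector 8 and an error code, i.e. a double fault was in flight when the shutdown exit happened, which would be consistent with a triple fault at the ltr instruction.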