Re: [Xen-devel] Xen inside KVM on AMD: Linux HVM/PVH crashes on AP bring up

On Mon, Apr 16, 2018 at 05:14:03PM +0200, Marek Marczykowski-Górecki wrote:
> Hi,
> 
> I'm trying to boot Linux PVH on Xen, which is running inside KVM on AMD
> hardware. As soon as a secondary CPU starts, the domain crashes,
> strangely without printing any related messages on the console. The
> last message is "x86: Booting SMP configuration:".
> This happens for both PVH and HVM with 2 vcpus. PVH/HVM domains with 1
> vcpu work fine(*), as well as PV domains with multiple vcpus.
> 
> Using gdbsx I've managed to get to the point where it crashes:
> 
>     (gdb) f 12
>     #12 0xffffffff81025101 in do_error_trap (regs=0xffffc9000037fe78, error_code=-2401053088876204019, 
>         str=0x40 <irq_stack_union+64> <error: Cannot access memory at address 0x40>, trapnr=6, signr=-2)
>         at arch/x86/kernel/traps.c:302
>     302	arch/x86/kernel/traps.c: No such file or directory.
>     (gdb) p/x *regs
>     $8 = {r15 = 0x0, r14 = 0x0, r13 = 0x0, r12 = 0x0, bp = 0x1, bx = 0xffff88007fd0f040, r11 = 0x0, 
>       r10 = 0x0, r9 = 0x38, r8 = 0x0, ax = 0xffffffe4, cx = 0xffffffff82251e68, dx = 0x0, si = 0x96, 
>       di = 0x82, orig_ax = 0xffffffffffffffff, ip = 0xffffffff81036bd3, cs = 0x10, flags = 0x10086, 
>       sp = 0xffffc9000037ff20, ss = 0x0}
>     (gdb) info symbol 0xffffffff81036bd3
>     identify_secondary_cpu + 83 in section .text
> 
> It is BUG_ON(c == &boot_cpu_data). If I read it correctly, "c" is 0x82,
> which indeed isn't &boot_cpu_data (0xffffffff8234fe00).
> 
> Any idea?
>
> Version info:
> Linux (L0, KVM): 4.4.114-42 (OpenSUSE Leap 42.3)
> Xen (L1): 4.8.3
> Linux dom0 (L1): 4.14.18
> Linux guest: 4.14.18

Upgrading the L0 kernel to 4.16.8 and the guest (L2) kernel to 4.15.6
fixed this problem. I'm not sure whether the L0 kernel upgrade was
necessary (on its own it didn't help), but the guest kernel upgrade
definitely was.
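
A side note on reading the quoted trace, for anyone who hits something
similar. Below is a minimal Python sketch that only reinterprets values
copied from the gdb session above. The helper name as_u64 and the small
vector table are illustrative assumptions, not anything from the kernel or
gdb. Read as an unsigned 64-bit value, error_code comes out as
0xdeadbeefdeadf00d, which looks like a placeholder pattern rather than a
real error code, so the frame arguments gdb printed may not be reliable.
trapnr=6 is at least consistent with the invalid-opcode trap that BUG_ON
raises on x86.

```python
# Values copied verbatim from the gdb output quoted above.
ERROR_CODE = -2401053088876204019      # error_code argument of do_error_trap
TRAPNR = 6                             # trapnr argument of do_error_trap
DI = 0x82                              # regs->di at the time of the trap
BOOT_CPU_DATA = 0xffffffff8234fe00     # &boot_cpu_data as reported in the mail

# Illustrative subset of x86 exception vectors (not a complete table).
X86_TRAPS = {
    0: "#DE divide error",
    6: "#UD invalid opcode",
    13: "#GP general protection",
    14: "#PF page fault",
}


def as_u64(value):
    """Reinterpret a signed 64-bit integer as an unsigned one."""
    return value & 0xFFFFFFFFFFFFFFFF


print("error_code as u64:", hex(as_u64(ERROR_CODE)))
print("trap vector:", X86_TRAPS.get(TRAPNR, "unknown"))
print("di equals &boot_cpu_data:", DI == BOOT_CPU_DATA)
```

Note that di at trap time need not still hold the function's first
argument, so the last comparison says little about "c" on its own.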

> (*) besides some 20s+ delay on flush_work in deferred_probe_initcall,
> before actually calling deferred_probe_work_func.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
