On Thu, 11 Aug 2016 13:10:57 +0800
Peter Xu <peterx@xxxxxxxxxx> wrote:

> On Wed, Aug 10, 2016 at 10:51:51AM +0200, Igor Mammedov wrote:
>
> [...]
>
> > > > Upstream guest kernel 4.7.0+ (d52bd54db) crashes when booting with irq remapping on:
> > > >
> > > > ./qemu-system-x86_64 -enable-kvm -smp 1,sockets=9,cores=32,threads=1,maxcpus=288 -device qemu64-x86_64-cpu,socket-id=8,core-id=30,thread-id=0 -bios x2apic_bios.bin -m 1G -nographic -device intel-iommu,intremap=on -machine q35,kernel-irqchip=split -snapshot -global ioapic.version=0x20 /dev/rhel72
> > > >
> > > >
> > > > [ 0.350669] smpboot: Max logical packages: 9
> > > > [ 0.351853] smpboot: APIC(0) Converting physical 0 to logical package 0
> > > > [ 0.353160] smpboot: APIC(11e) Converting physical 8 to logical package 1
> > > > [ 0.354627] DMAR: Host address width 39
> > > > [ 0.355621] DMAR: DRHD base: 0x000000fed90000 flags: 0x1
> > > > [ 0.356785] DMAR: dmar0: reg_base_addr fed90000 ver 1:0 cap 12008c22260206 ecap f00f1a
> > > > [ 0.358721] DMAR-IR: IOAPIC id 0 under DRHD base 0xfed90000 IOMMU 0
> > > > [ 0.360029] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
> > > > [ 0.364605] DMAR-IR: Enabled IRQ remapping in x2apic mode
> > > > [ 0.365805] BUG: unable to handle kernel NULL pointer dereference at (null)
> > > > [ 0.367965] IP: [<ffffffff8105b025>] x2apic_cluster_probe+0x35/0x70
> > > > [ 0.369373] PGD 0
> > > > [ 0.370258] Oops: 0002 [#1] SMP
> > > > [ 0.371140] Modules linked in:
> > > > [ 0.372150] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.7.0+ #647
> > > > [ 0.373485] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.9.0-143-gbac87e4 04/01/2014
> > > > [ 0.375622] task: ffff880039ad0000 task.stack: ffff880039ad8000
> > > > [ 0.376875] RIP: 0010:[<ffffffff8105b025>] [<ffffffff8105b025>] x2apic_cluster_probe+0x35/0x70
> > > > [ 0.379066] RSP: 0000:ffff880039adbe28 EFLAGS: 00010202
> > > > [ 0.380299] RAX: 0000000000000000 RBX: ffffffff81f388a8 RCX: ffff880039e00000
> > > > [ 0.381677] RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000000000246
> > > > [ 0.383096] RBP: ffff880039adbe28 R08: 00000000000000c6 R09: ffff8800000b9b80
> > > > [ 0.384579] R10: 00000000000000a0 R11: 0000000000000050 R12: 0000000000002000
> > > > [ 0.385990] R13: 000000000000a118 R14: 000000000000011f R15: 0000000000000120
> > > > [ 0.387448] FS: 0000000000000000(0000) GS:ffff880039e00000(0000) knlGS:0000000000000000
> > > > [ 0.389454] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > [ 0.390697] CR2: 0000000000000000 CR3: 0000000001c06000 CR4: 00000000000006f0
> > > > [ 0.392114] Stack:
> > > > [ 0.392889] ffff880039adbe40 ffffffff81da277c 000000000000a110 ffff880039adbe78
> > > > [ 0.395135] ffffffff81d9c055 ffffffff81f14c60 ffff880039ad0a58 ffffffff81c95ac0
> > > > [ 0.397469] ffffffff818232c0 ffff880039ad0000 ffff880039adbf38 ffffffff81d86293
> > > > [ 0.399695] Call Trace:
> > > > [ 0.400529] [<ffffffff81da277c>] default_setup_apic_routing+0x28/0x69
> > > > [ 0.401881] [<ffffffff81d9c055>] native_smp_prepare_cpus+0x223/0x2d2
> > > > [ 0.403260] [<ffffffff81d86293>] kernel_init_freeable+0xd8/0x249
> > > > [ 0.404525] [<ffffffff816d1b2e>] kernel_init+0xe/0x110
> > > > [ 0.405703] [<ffffffff816deb3f>] ret_from_fork+0x1f/0x40
> > > > [ 0.406966] [<ffffffff816d1b20>] ? rest_init+0x80/0x80
> > > > [ 0.408165] Code: 00 31 c0 65 8b 15 2c f1 fa 7e 85 c9 75 01 c3 48 63 ca 55 48 c7 c0 10 d7 00 00 48 8b 0c cd 20 8d d4 81 89 d2 48 89 e5 48 8b 04 08 <f0> 48 0f ab 10 49 c7 c0 60 b0 05 81 48 c7 c1 a0 ae 05 81 ba 01
> > > > [ 0.417107] RIP [<ffffffff8105b025>] x2apic_cluster_probe+0x35/0x70
> > > > [ 0.418516] RSP <ffff880039adbe28>
> > > > [ 0.419461] CR2: 0000000000000000
> > > > [ 0.420386] ---[ end trace f68728a0d3053b52 ]---
>
> I failed to reproduce this panic on my machine with parameter:
>
> bin=x86_64-softmmu/qemu-system-x86_64
> $bin -M q35,kernel-irqchip=split -enable-kvm -m 2048 \
>      -monitor stdio -smp 4 \
>      -device intel-iommu,intremap=on \
>      -netdev user,id=net0,hostfwd=tcp::5555-:22 \
>      -device e1000,netdev=net0 \
>      -kernel /root/git/linux/arch/x86/boot/bzImage \
>      -append root=/dev/sda3 \
>      /root/images/rhel-7.2.qcow2
>
> Guest kernel version is exactly 4.7.0+ (d52bd54db). In the guest, I
> see x2apic enabled. Did I miss anything special?
>
You missed the presence of the x2apic CPU that this series enables;
add/change the CLI as follows:

  -smp 1,sockets=9,cores=32,threads=1,maxcpus=288 \
  -device qemu64-x86_64-cpu,socket-id=8,core-id=30,thread-id=0

+ add x2apic_phys to the kernel's command line

PS:
the last kernel I've tried is: v4.8-rc1-53-ga0cba21 + fix from Luiz

> [...]
>
> > adding x2apic_phys to kernel's command line makes it crash but at another place:
> >
> > [ 0.364909] smpboot: Max logical packages: 9
> > [ 0.365838] smpboot: APIC(0) Converting physical 0 to logical package 0
> > [ 0.367183] smpboot: APIC(11e) Converting physical 8 to logical package 1
> > [ 0.370291] x2apic: IRQ remapping doesn't support X2APIC mode
> > [ 0.371901] x2apic disabled
>
> Failed to understand why x2apic_phys will affect the system if x2apic
> is disabled after all.
It looks like, despite printing "x2apic disabled", it still tries to
access MSRs that are available only when the CPU is in x2apic mode.

>
> Thanks!
>
> -- peterx
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
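
For reference, a rough sketch of why the CPU added with -device above
needs x2apic at all. It assumes the socket/core/thread IDs are packed
into the APIC ID as bit fields whose widths are rounded up to a power of
two (the bits_for helper and the variable names are only illustrative,
not taken from QEMU or the kernel). Under that assumption the result
matches the "APIC(11e)" line in the smpboot output quoted above, and an
ID that large does not fit in the 8-bit xAPIC ID field:

```shell
#!/bin/sh
# Topology values taken from the -smp / -device options in this thread.
cores=32
threads=1
socket_id=8
core_id=30
thread_id=0

# bits_for N: number of bits needed to encode the values 0..N-1
# (hypothetical helper, assumes power-of-two field widths).
bits_for() {
    n=$1
    bits=0
    while [ $((1 << bits)) -lt "$n" ]; do
        bits=$((bits + 1))
    done
    echo "$bits"
}

thread_bits=$(bits_for "$threads")   # 0 bits for threads=1
core_bits=$(bits_for "$cores")       # 5 bits for cores=32

# Pack socket | core | thread into one ID.
apic_id=$(( (socket_id << (core_bits + thread_bits)) \
          | (core_id << thread_bits) \
          | thread_id ))
apic_id_hex=$(printf '%x' "$apic_id")

echo "APIC ID: $apic_id (0x$apic_id_hex)"   # prints: APIC ID: 286 (0x11e)

# An xAPIC ID is 8 bits wide, so anything above 255 requires x2apic.
if [ "$apic_id" -gt 255 ]; then
    echo "does not fit in 8 bits, x2apic required"
fi
```

With socket-id=8 and core-id=30 this gives (8 << 5) | 30 = 286 = 0x11e,
which is why a plain "-smp 4" guest never exercises this path.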