[BUG] Guest kernel divide error in kvm_unlock_kick

Chris Webb <chris@xxxxxxxxxxxx> · Mon, 8 Sep 2014 14:28:07 +0100

I've reported this bug before: it reliably crashes a guest kernel shortly
after boot. I have just reconfirmed that it is still present with Linux
3.16.2 guest and host kernels and Qemu 2.1.

Running a 3.16.2 x86-64 SMP guest kernel on qemu-2.1, with kvm enabled and
-cpu host on a 3.16.2 AMD Opteron host, I'm seeing a reliable kernel panic
from the guest shortly after boot. I think it is happening in kvm_unlock_kick()
in the paravirt_ops code:

divide error: 0000 [#1] PREEMPT SMP 
Modules linked in:
CPU: 0 PID: 743 Comm: syslogd Not tainted 3.16.2-guest #2
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
task: ffff88007c972580 ti: ffff88007cb7c000 task.ti: ffff88007cb7c000
RIP: 0010:[<ffffffff81037fe2>]  [<ffffffff81037fe2>] kvm_unlock_kick+0x72/0x80
RSP: 0000:ffff88007fc03ec8  EFLAGS: 00010046
RAX: 0000000000000005 RBX: 0000000000000000 RCX: 0000000000000003
RDX: 0000000000000003 RSI: ffffffff81a466a0 RDI: 0000000000000000
RBP: ffffffff81a466a0 R08: ffffffff81b98940 R09: 0000000000000246
R10: 0000000000000400 R11: 0000000000000000 R12: 00000000000000ea
R13: 0000000000000009 R14: 0000000000000002 R15: ffff88007fc0d300
FS:  00007f2a6473e700(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000004a8240 CR3: 000000007ac75000 CR4: 00000000000406f0
Stack:
 ffffffff81a46400 0000000000000246 0000000000000001 ffffffff8168979d
 0000000000000282 ffffffff81110d97 0000000000000007 ffff88007cb7ffd8
 ffff88007c972580 000000004b0782e8 0000000000000002 ffffffff81a0b0c8
Call Trace:
 <IRQ> 
 [<ffffffff8168979d>] ? _raw_spin_unlock_irqrestore+0x5d/0x80
 [<ffffffff81110d97>] ? rcu_process_callbacks+0x337/0x4f0
 [<ffffffff810cde2d>] ? __do_softirq+0xfd/0x210
 [<ffffffff810ce06e>] ? irq_exit+0x7e/0xa0
 [<ffffffff8103063b>] ? smp_apic_timer_interrupt+0x3b/0x50
 [<ffffffff8168b04d>] ? apic_timer_interrupt+0x6d/0x80
 <EOI> 
 [<ffffffff8114180b>] ? filemap_map_pages+0x17b/0x240
 [<ffffffff811418c0>] ? filemap_map_pages+0x230/0x240
 [<ffffffff811679e2>] ? do_read_fault.isra.70+0x2a2/0x320
 [<ffffffff811696cc>] ? handle_mm_fault+0x37c/0xd00
 [<ffffffff8103bb45>] ? __do_page_fault+0x185/0x4c0
 [<ffffffff8168b958>] ? async_page_fault+0x28/0x30
 [<ffffffff813b9610>] ? __put_user_4+0x20/0x30
 [<ffffffff8168b958>] ? async_page_fault+0x28/0x30
Code: c0 ca a7 81 48 8d 04 0b 48 8b 30 48 39 ee 75 c9 0f b6 40 08 44 38 e0 75 c0 48 c7 c0 22 b0 00 00 31 db 0f b7 0c 08 b8 05 00 00 00 <0f> 01 c1 0f 1f 00 5b 5d 41 5c c3 0f 1f 00 48 c7 c0 10 cf 00 00 
RIP  [<ffffffff81037fe2>] kvm_unlock_kick+0x72/0x80
 RSP <ffff88007fc03ec8>
---[ end trace be08885ac2c94c6a ]---
Kernel panic - not syncing: Fatal exception in interrupt
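
As a reading aid for the trace above, here is a minimal Python sketch that
decodes the "Code:" line. It is not part of the failing setup: the function
name split_at_rip and the small opcode table are illustrative assumptions.
The encodings (0f 01 c1 for VMCALL, 0f 01 d9 for VMMCALL) are taken from the
Intel and AMD manuals, and hypercall number 5 is KVM_HC_KICK_CPU in
include/uapi/linux/kvm_para.h.

```python
# "Code:" line copied verbatim from the oops above; <..> marks the byte at RIP.
CODE_LINE = """
c0 ca a7 81 48 8d 04 0b 48 8b 30 48 39 ee 75 c9 0f b6 40 08 44 38 e0
75 c0 48 c7 c0 22 b0 00 00 31 db 0f b7 0c 08 b8 05 00 00 00 <0f> 01 c1
0f 1f 00 5b 5d 41 5c c3 0f 1f 00 48 c7 c0 10 cf 00 00
"""

# Three-byte encodings of the two hypercall instructions.
HYPERCALL_OPCODES = {
    bytes.fromhex("0f01c1"): "vmcall (Intel VMX encoding)",
    bytes.fromhex("0f01d9"): "vmmcall (AMD SVM encoding)",
}


def split_at_rip(code_line):
    """Return (bytes before RIP, bytes from RIP onwards)."""
    tokens = code_line.split()
    # The oops wraps the byte at the faulting RIP in angle brackets.
    rip_index = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    raw = bytes(int(tok.strip("<>"), 16) for tok in tokens)
    return raw[:rip_index], raw[rip_index:]


before, at_rip = split_at_rip(CODE_LINE)
faulting = HYPERCALL_OPCODES.get(at_rip[:3], "unknown")

# If the five bytes just before RIP are "b8 imm32" (mov $imm32, %eax),
# recover the immediate, which is the hypercall number.
eax_imm = int.from_bytes(before[-4:], "little") if before[-5] == 0xB8 else None

opcode_text = " ".join(f"{b:02x}" for b in at_rip[:3])
print(f"instruction at RIP: {opcode_text} -> {faulting}")
print(f"immediate loaded into eax just before: {eax_imm}")
```

The instruction at RIP is the hypercall itself, issued with eax = 5, which
matches RAX in the register dump. The sketch only confirms which instruction
the trap was reported against; it does not establish whether the VMCALL
encoding on an AMD host is related to the divide error.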

My host kernel config is at http://cdw.me.uk/tmp/host-config.txt and the guest
config is at http://cdw.me.uk/tmp/guest-config.txt. The qemu command line is:

 qemu-system-x86 -enable-kvm -cpu host -machine q35 -m 2048 -name $1 \
   -smp sockets=1,cores=4 -pidfile /run/$1.pid -runas nobody \
   -serial stdio -vga none -vnc none -kernel /boot/vmlinuz-guest \
   -append "console=ttyS0 root=/dev/vda" \
   -drive file=/dev/guest/$1,cache=none,format=raw,if=virtio \
   -device virtio-rng-pci \
   -device virtio-net-pci,netdev=nic,mac=$(< /sys/class/net/$1/address) \
   -netdev tap,id=nic,fd=3 3<>/dev/tap$(< /sys/class/net/$1/ifindex)

I can stop this crash by disabling CONFIG_PARAVIRT_SPINLOCKS in my guest
kernel, running with -cpu qemu64 instead of -cpu host, or running with -smp 1
instead of -smp 4. (Removing/changing the -machine q35 makes no difference.)

/proc/cpuinfo on the host has 8 of these:

processor	: 0
vendor_id	: AuthenticAMD
cpu family	: 21
model		: 2
model name	: AMD Opteron(tm) Processor 6328
stepping	: 0
microcode	: 0x600081c
cpu MHz		: 3200.000
cache size	: 2048 KB
physical id	: 0
siblings	: 8
core id		: 0
cpu cores	: 4
apicid		: 32
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid amd_dcm aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb arat cpb hw_pstate npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold bmi1
bogomips	: 6399.70
TLB size	: 1536 4K pages
clflush size	: 64
cache_alignment	: 64
address sizes	: 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro

and /proc/cpuinfo on the guest has 4 of these:

processor	: 0
vendor_id	: AuthenticAMD
cpu family	: 21
model		: 2
model name	: AMD Opteron(tm) Processor 6328
stepping	: 0
microcode	: 0x1000065
cpu MHz		: 3199.852
cache size	: 2048 KB
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 4
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb lm rep_good nopl extd_apicid pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic popcnt aes xsave avx f16c hypervisor lahf_lm cmp_legacy svm cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw xop fma4 tbm arat npt nrip_save bmi1
bogomips	: 6399.70
TLB size	: 1536 4K pages
clflush size	: 64
cache_alignment	: 64
address sizes	: 40 bits physical, 48 bits virtual
power management:
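
To make the two flag lists above easier to compare, this Python sketch
computes the set difference in each direction. The flag strings are copied
from the listings above; the variable and function names are illustrative
only.

```python
# "flags" lines copied from the host and guest /proc/cpuinfo listings above.
HOST_FLAGS = """
fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36
clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm
constant_tsc rep_good nopl nonstop_tsc extd_apicid amd_dcm aperfmperf pni
pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c
lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse
3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext
perfctr_core perfctr_nb arat cpb hw_pstate npt lbrv svm_lock nrip_save
tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold bmi1
"""

GUEST_FLAGS = """
fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36
clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb lm rep_good
nopl extd_apicid pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic popcnt
aes xsave avx f16c hypervisor lahf_lm cmp_legacy svm cr8_legacy abm sse4a
misalignsse 3dnowprefetch osvw xop fma4 tbm arat npt nrip_save bmi1
"""


def flag_set(flags_text):
    """Split a whitespace-separated flags listing into a set of names."""
    return set(flags_text.split())


host = flag_set(HOST_FLAGS)
guest = flag_set(GUEST_FLAGS)

guest_only = sorted(guest - host)  # flags visible in the guest but not the host
host_only = sorted(host - guest)   # flags the host has that the guest does not see

print("guest only:", " ".join(guest_only))
print("host only: ", " ".join(host_only))
```

The only flags the guest sees that the host lacks are hypervisor and x2apic;
every other guest flag also appears in the host list.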

Full dumps are at http://cdw.me.uk/tmp/host-cpuinfo.txt and
http://cdw.me.uk/tmp/guest-cpuinfo.txt respectively. I've also put the host
and guest dmesg output shortly after booting at

  http://cdw.me.uk/tmp/host-dmesg.txt
  http://cdw.me.uk/tmp/guest-dmesg.txt

I tried enabling CONFIG_PARAVIRT_DEBUG, but no extra information was
reported. These kernels are built with frame pointers and -O2 rather than
-Os to try to maximise useful debug info.

Any help would be extremely gratefully received: I'm at a complete loss as
to what to do next to debug this so I can start using less ancient kernel
and qemu versions!

Best wishes,

Chris.