Thanks for looking into the issue! The OpenStack compute node is
running on a bare-metal server, so the VMs are not nested.

On Fri, Apr 20, 2018 at 6:09 PM, Christopherson, Sean J
<sean.j.christopherson@xxxxxxxxx> wrote:
> On Wed, 2018-04-18, Allen Yu wrote:
>> Hi,
>>
>> I have been tracing the source of a nasty bug that happens on my
>> OpenStack Liberty cluster recently, but in vain.
>>
>> Basically when I start a new instance, OpenStack would report it as
>> "Paused" immediately. When I log in to the compute node and check the
>> qemu logs, I see a KVM internal error. Suberror: 3.
>
> Any chance the failing VM is a nested VM? The failure signature you
> are seeing is identical to a recently fixed bug that manifested when
> injecting an APIC timer event to L2.
>
> https://patchwork.kernel.org/patch/9593073/
>
>> /var/log/libvirt/qemu/instance.log
>> 2018-04-18 22:51:49.503+0000: starting up
>> LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin
>> QEMU_AUDIO_DRV=none /usr/bin/kvm -name instance-000000fc -S -machine
>> pc-i440fx-trusty,accel=kvm,usb=off -cpu
>> SandyBridge,+invpcid,+erms,+bmi2,+smep,+avx2,+bmi1,+fsgsbase,+abm,+pdpe1gb,+rdrand,+f16c,+osxsave,+movbe,+dca,+pcid,+pdcm,+xtpr,+fma,+tm2,+est,+smx,+vmx,+ds_cpl,+monitor,+dtes64,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme
>> -m 245000 -realtime mlock=off -smp 46,sockets=46,cores=1,threads=1
>> -uuid d6bec617-7977-4fd6-a6ad-cd9757735fdc -smbios
>> type=1,manufacturer=OpenStack Foundation,product=OpenStack
>> Nova,version=12.0.5,serial=d8eca30f-2e73-4dfb-9270-244084458637,uuid=d6bec617-7977-4fd6-a6ad-cd9757735fdc,family=Virtual
>> Machine -no-user-config -nodefaults -chardev
>> socket,id=charmonitor,path=/var/lib/libvirt/qemu/instance-000000fc.monitor,server,nowait
>> -mon chardev=charmonitor,id=monitor,mode=control -rtc
>> base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard
>> -no-hpet -no-shutdown -boot strict=on -device
>> piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive
>> file=/var/lib/nova/instances/d6bec617-7977-4fd6-a6ad-cd9757735fdc/disk,if=none,id=drive-virtio-disk0,format=qcow2,cache=none
>> -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
>> -drive file=/var/lib/nova/instances/d6bec617-7977-4fd6-a6ad-cd9757735fdc/disk.swap,if=none,id=drive-virtio-disk1,format=qcow2,cache=none
>> -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk1,id=virtio-disk1
>> -netdev tap,fd=25,id=hostnet0,vhost=on,vhostfd=26 -device
>> virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:42:9d:6f,bus=pci.0,addr=0x3
>> -chardev file,id=charserial0,path=/var/lib/nova/instances/d6bec617-7977-4fd6-a6ad-cd9757735fdc/console.log
>> -device isa-serial,chardev=charserial0,id=serial0 -chardev
>> pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1
>> -device usb-tablet,id=input0 -vnc 0.0.0.0:0 -k en-us -device
>> cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device
>> virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -msg timestamp=on
>> Domain id=6 is tainted: high-privileges
>> char device redirected to /dev/pts/7 (label charserial1)
>> KVM internal error. Suberror: 3
>> extra data[0]: 800000ef
>> extra data[1]: 31
>> RAX=0000000000000000 RBX=ffff883a8fee5fd8 RCX=00000000ffffffff
>> RDX=0000000000000000
>> RSI=0000000000000001 RDI=ffffffff81dd9e48 RBP=ffff883a8fee5ec8
>> RSP=ffff883a8fee5ec8
>> R8 =0000000000000000 R9 =0000000000000000 R10=0000000000000000
>> R11=0000000000000000
>> R12=ffffffff81cdbe60 R13=000000000000000a R14=0000000000000000
>> R15=0000000000000000
>> RIP=ffffffff8103cf6b RFL=00000246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
>> ES =0000 0000000000000000 ffffffff 00c00000
>> CS =0010 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
>> SS =0000 0000000000000000 ffffffff 00c00000
>> DS =0000 0000000000000000 ffffffff 00c00000
>> FS =0000 0000000000000000 ffffffff 00c00000
>> GS =0000 ffff883b20340000 ffffffff 00c00000
>> LDT=0000 0000000000000000 ffffffff 00c00000
>> TR =0040 ffff883b203512c0 00002087 00008b00 DPL=0 TSS64-busy
>> GDT= ffff883b20344000 0000007f
>> IDT= ffffffff81dd6000 00000fff
>> CR0=8005003b CR2=00000000ffffffff CR3=0000000001c05000 CR4=001406e0
>> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000
>> DR3=0000000000000000
>> DR6=00000000ffff0ff0 DR7=0000000000000400
>> EFER=0000000000000d01
>> Code=66 90 fb 5d c3 0f 1f 40 00 55 48 89 e5 66 66 66 66 90 fb f4 <5d>
>> c3 0f 1f 00 55 48 89 e5 66 66 66 66 90 f4 5d c3 0f 1f 40 00 55 48 89
>> e5 66 66 66 66 90
>>
>> Next I enabled kernel tracing according to the instructions on
>> https://www.linux-kvm.org/page/Tracing. I noted a lot of page faults
>> and IO errors. Here is an excerpt:
>> qemu-system-x86-4848 [021] 561909.361044: kvm_update_master_clock:
>> masterclock 0 hostclock tsc offsetmatched 0
>> qemu-system-x86-4848 [021] 561909.361148: kvm_fpu: load
>> qemu-system-x86-4848 [021] 561909.361151: kvm_entry: vcpu 0
>> qemu-system-x86-4848 [021] 561909.361155: kvm_exit:
>> reason EPT_VIOLATION rip 0xfff0 info 184 0
>> qemu-system-x86-4848 [021] 561909.361157: kvm_page_fault:
>> address fffffff0 error_code 184
>> qemu-system-x86-4848 [021] 561909.361167: kvm_entry: vcpu 0
>> qemu-system-x86-4848 [021] 561909.361168: kvm_exit:
>> reason EPT_VIOLATION rip 0xe05b info 184 0
>> qemu-system-x86-4848 [021] 561909.361168: kvm_page_fault:
>> address fe05b error_code 184
>> qemu-system-x86-4848 [021] 561909.361171: kvm_entry: vcpu 0
>> qemu-system-x86-4848 [021] 561909.361172: kvm_exit:
>> reason EPT_VIOLATION rip 0xe05b info 181 0
>> qemu-system-x86-4848 [021] 561909.361172: kvm_page_fault:
>> address f6574 error_code 181
>>
>> The full tracing report is also available at
>> https://www.dropbox.com/s/hmee8sr0zcruqyh/trace-cmd.report.gz?dl=0
>>
>> Other system info:
>> OS: Ubuntu 14.04.5
>> Kernel: 3.13.0-123-generic
>> QEMU: version 2.0.0 (Debian 2.0.0+dfsg-2ubuntu1.40)
>> CPU: 2 x Intel(R) Xeon(R) CPU E5-2670 v3 @ 2.30GHz VT-d enabled
>> Motherboard: X10DRT-PT Intel C610 chipset
>> BIOS: American Megatrends Inc. version: 1.0c date: 04/10/2015
>> RAM: 16 x 16GB Samsung M393A2G40DB0-CPB
>>
>> Any comments or hints would be greatly appreciated. Thank you very much!
>>
>> Best regards,
>> Allen
>>
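
For reference, the two "extra data" words in the qemu log above can be decoded by hand. The sketch below is an illustration only. It assumes that QEMU prints both values in hexadecimal, and that for Suberror 3 KVM reports the VMX IDT-vectoring information in extra data[0] and the VM-exit reason in extra data[1]. The bit layout follows the Intel SDM description of that field. The function names and the two-entry exit-reason table are hypothetical helpers, not part of KVM or QEMU.

```python
# Event types encoded in bits 10:8 of the IDT-vectoring information field.
EVENT_TYPES = {
    0: "external interrupt",
    2: "NMI",
    3: "hardware exception",
    4: "software interrupt",
    5: "privileged software exception",
    6: "software exception",
}

# Deliberately partial table of basic VM-exit reasons (assumption: only
# the two EPT-related reasons are needed for this log).
EXIT_REASONS = {48: "EPT_VIOLATION", 49: "EPT_MISCONFIG"}


def decode_vectoring_info(value):
    """Split an IDT-vectoring information word into its fields."""
    return {
        "valid": bool((value >> 31) & 1),            # bit 31
        "vector": value & 0xFF,                      # bits 7:0
        "type": EVENT_TYPES.get((value >> 8) & 0x7, "reserved"),
        "error_code_valid": bool((value >> 11) & 1),  # bit 11
    }


def decode_exit_reason(value):
    """Name the basic exit reason held in bits 15:0."""
    basic = value & 0xFFFF
    return EXIT_REASONS.get(basic, "exit reason %d" % basic)


# Values copied from the log, read as hexadecimal.
info = decode_vectoring_info(0x800000EF)
reason = decode_exit_reason(0x31)
print(info)
print(reason)
```

Read this way, the log describes an external interrupt with vector 0xef that was being delivered when the exit occurred, and an exit reason of 49 (EPT misconfiguration) rather than the EPT violations seen in the trace excerpt. Whether vector 0xef is the guest's local APIC timer depends on the guest kernel, which the report does not state.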