On Thu, Mar 26, 2015 at 5:47 AM, Bandan Das <bsd@xxxxxxxxxx> wrote:
> Hi Andrey,
>
> Andrey Korolyov <andrey@xxxxxxx> writes:
>
>> On Mon, Mar 16, 2015 at 10:17 PM, Andrey Korolyov <andrey@xxxxxxx> wrote:
>>> For now, it looks like bug have a mixed Murphy-Heisenberg nature, as
>>> it appearance is very rare (compared to the number of actual launches)
>>> and most probably bounded to the physical characteristics of my
>>> production nodes. As soon as I reach any reproducible path for a
>>> regular workstation environment, I`ll let everyone know. Also I am
>>> starting to think that issue can belong to the particular motherboard
>>> firmware revision, despite fact that the CPU microcode is the same
>>> everywhere.
>
> I will take the risk and say this - "could it be a processor bug ?" :)
>
>>
>> Hello everyone, I`ve managed to reproduce this issue
>> *deterministically* with latest seabios with smp fix and 3.18.3. The
>> error occuring just *once* per vm until hypervisor reboots, at least
>> in my setup, this is definitely crazy...
>>
>> - launch two VMs (Centos 7 in my case),
>> - wait a little while they are booting,
>> - attach serial console (I am using virsh list for this exact purpose),
>> - issue acpi reboot or reset, does not matter,
>> - VM always hangs at boot, most times with sgabios initialization
>> string printed out [1], but sometimes it hangs a bit later [2],
>> - no matter how many times I try to relaunch the QEMU afterwards, the
>> issue does not appear on VM which experienced problem once;
>> - trace and sample args can be seen in [3] and [4] respectively.
>
> My system is a Dell R720 dual socket which has 2620v2s. I tried your
> setup but couldn't reproduce (my qemu cmdline isn't exactly the same
> as yours), although, if you could simplify your command line a bit,
> I can try again.
>
> Bandan
>
>> 1)
>> Google, Inc.
>> Serial Graphics Adapter 06/11/14
>> SGABIOS $Id: sgabios.S 8 2010-04-22 00:03:40Z nlaredo $
>> (pbuilder@zorak) Wed Jun 11 05:57:34 UTC 2014
>> Term: 211x62
>> 4 0
>>
>> 2)
>> Google, Inc.
>> Serial Graphics Adapter 06/11/14
>> SGABIOS $Id: sgabios.S 8 2010-04-22 00:03:40Z nlaredo $
>> (pbuilder@zorak) Wed Jun 11 05:57:34 UTC 2014
>> Term: 211x62
>> 4 0
>> [...empty screen...]
>> SeaBIOS (version 1.8.1-20150325_230423-testnode)
>> Machine UUID 3c78721f-7317-4f85-bcbe-f5ad46d293a1
>>
>>
>> iPXE (http://ipxe.org) 00:02.0 C100 PCI2.10 PnP PMM+3FF95BA0+3FEF5BA0 C10
>>
>> 3)
>>
>> KVM internal error. Suberror: 2
>> extra data[0]: 800000ef
>> extra data[1]: 80000b0d
>> EAX=00000000 EBX=00000000 ECX=00000000 EDX=00000000
>> ESI=00000000 EDI=00000000 EBP=00000000 ESP=00006d2c
>> EIP=0000d331 EFL=00010202 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
>> ES =0000 00000000 0000ffff 00009300
>> CS =f000 000f0000 0000ffff 00009b00
>> SS =0000 00000000 0000ffff 00009300
>> DS =0000 00000000 0000ffff 00009300
>> FS =0000 00000000 0000ffff 00009300
>> GS =0000 00000000 0000ffff 00009300
>> LDT=0000 00000000 0000ffff 00008200
>> TR =0000 00000000 0000ffff 00008b00
>> GDT= 000f6cb0 00000037
>> IDT= 00000000 000003ff
>> CR0=00000010 CR2=00000000 CR3=00000000 CR4=00000000
>> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000
>> DR3=0000000000000000
>> DR6=00000000ffff0ff0 DR7=0000000000000400
>> EFER=0000000000000000
>> Code=66 c3 cd 02 cb cd 10 cb cd 13 cb cd 15 cb cd 16 cb cd 18 cb <cd>
>> 19 cb cd 1c cb cd 4a cb fa fc 66 ba 47 d3 0f 00 e9 ad fe f3 90 f0 0f
>> ba 2d d4 fe fb 3f
>>
>> 4)
>> /usr/bin/qemu-system-x86_64 -name centos71 -S -machine
>> pc-i440fx-2.1,accel=kvm,usb=off -cpu SandyBridge,+kvm_pv_eoi -bios
>> /usr/share/seabios/bios.bin -m 1024 -realtime mlock=off -smp
>> 12,sockets=1,cores=12,threads=12 -uuid
>> 3c78721f-7317-4f85-bcbe-f5ad46d293a1 -nographic -no-user-config
>> -nodefaults -device sga -chardev
>> socket,id=charmonitor,path=/var/lib/libvirt/qemu/centos71.monitor,server,nowait
>> -mon chardev=charmonitor,id=monitor,mode=control -rtc
>> base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard
>> -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global
>> PIIX4_PM.disable_s4=1 -boot strict=on -device
>> nec-usb-xhci,id=usb,bus=pci.0,addr=0x3 -device
>> virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 -drive
>> file=rbd:dev-rack2/centos7-1.raw:id=qemukvm:key=XXXXXXXXXXXXXXXXXXXXXXXXXX:auth_supported=cephx\;none:mon_host=10.6.0.1\:6789\;10.6.0.3\:6789\;10.6.0.4\:6789,if=none,id=drive-virtio-disk0,format=raw,cache=writeback,aio=native
>> -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
>> -chardev pty,id=charserial0 -device
>> isa-serial,chardev=charserial0,id=serial0 -chardev
>> socket,id=charchannel0,path=/var/lib/libvirt/qemu/centos71.sock,server,nowait
>> -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.1
>> -msg timestamp=on

Hehe, 2.2 works just perfectly but 2.1 doesn't. I'll bisect the issue
in the next couple of days and post the right commit (but as far as I
can remember, none of the commits between 2.1 and 2.2 was meant to fix
a similar issue). I've attached a reference XML to simplify playing
with libvirt, if anyone is willing to do so.
<domain type='kvm' id='2' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
  <name>centos71</name>
  <uuid>3c78721f-7317-4f85-bcbe-f5ad46d293a1</uuid>
  <memory unit='KiB'>1048576</memory>
  <currentMemory unit='KiB'>1048576</currentMemory>
  <vcpu placement='static' cpuset='0-11'>12</vcpu>
  <os>
    <type arch='x86_64' machine='pc-i440fx-2.1'>hvm</type>
    <loader>/usr/share/seabios/bios.bin</loader>
    <boot dev='hd'/>
    <bios useserial='yes'/>
  </os>
  <features>
    <acpi/>
    <apic eoi='on'/>
    <pae/>
  </features>
  <cpu mode='custom' match='exact'>
    <model fallback='allow'>SandyBridge</model>
    <vendor>Intel</vendor>
    <topology sockets='1' cores='12' threads='12'/>
  </cpu>
  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='no'/>
  </pm>
  <devices>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>
    <disk type='network' device='disk'>
      <driver name='qemu' type='raw' cache='writeback' io='native'/>
      <auth username='qemukvm'>
        <secret type='ceph' uuid='ec833836-71ae-f516-0666-7d2f6e123097'/>
      </auth>
      <source protocol='rbd' name='dev-rack2/centos7-1.raw'>
        <host name='10.6.0.1' port='6789'/>
        <host name='10.6.0.3' port='6789'/>
        <host name='10.6.0.4' port='6789'/>
      </source>
      <backingStore/>
      <target dev='vda' bus='virtio'/>
      <iotune>
        <read_bytes_sec>30000000</read_bytes_sec>
        <write_bytes_sec>15000000</write_bytes_sec>
        <read_iops_sec>100</read_iops_sec>
        <write_iops_sec>50</write_iops_sec>
      </iotune>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </disk>
    <controller type='usb' index='0' model='nec-xhci'>
      <alias name='usb0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'>
      <alias name='pci.0'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <alias name='virtio-serial0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </controller>
    <serial type='pty'>
      <source path='/dev/pts/1'/>
      <target type='isa-serial' port='0'/>
      <alias name='serial0'/>
    </serial>
    <console type='pty' tty='/dev/pts/1'>
      <source path='/dev/pts/1'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
    </console>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/centos71.sock'/>
      <target type='virtio' name='org.qemu.guest_agent.1'/>
      <alias name='channel0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <memballoon model='none'>
      <alias name='balloon0'/>
    </memballoon>
  </devices>
  <seclabel type='none'/>
  <qemu:commandline>
    <qemu:arg value='-chardev'/>
    <qemu:arg value='file,path=/tmp/seabioslog-vm1.log,id=seabios'/>
    <qemu:arg value='-device'/>
    <qemu:arg value='isa-debugcon,iobase=0x402,chardev=seabios'/>
  </qemu:commandline>
</domain>
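
For anyone replaying the XML above, here is a minimal sketch (Python 3, standard library only) of a sanity check that compares the <vcpu> count with the product of the <topology> attributes. The function name vcpu_vs_topology and the trimmed SAMPLE string are made up for illustration; they are not part of libvirt or of the original report.

```python
import xml.etree.ElementTree as ET

# Trimmed, illustrative copy of the relevant parts of a libvirt domain XML.
SAMPLE = """
<domain type='kvm'>
  <vcpu placement='static' cpuset='0-11'>12</vcpu>
  <cpu mode='custom' match='exact'>
    <topology sockets='1' cores='12' threads='12'/>
  </cpu>
</domain>
"""


def vcpu_vs_topology(domain_xml):
    """Return (vcpu_count, sockets*cores*threads) for a domain XML string.

    The second element is None when no <topology> element is present.
    """
    root = ET.fromstring(domain_xml)
    vcpus = int(root.findtext("vcpu"))
    topo = root.find("cpu/topology")
    if topo is None:
        return vcpus, None
    product = (
        int(topo.get("sockets"))
        * int(topo.get("cores"))
        * int(topo.get("threads"))
    )
    return vcpus, product


if __name__ == "__main__":
    vcpus, product = vcpu_vs_topology(SAMPLE)
    if product is not None and product != vcpus:
        print(f"vcpu count {vcpus} differs from topology product {product}")
    else:
        print(f"vcpu count {vcpus} matches the topology")
```

Fed the full domain definition instead of SAMPLE, it compares the 12 vCPUs with sockets=1, cores=12, threads=12. Whether that has anything to do with the hang is not established here; it is only a quick way to spot the difference before simplifying the command line.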