Hi,

I'm fairly experienced with KVM (CentOS 5/6), running about a dozen servers with 20-30 different systems (Linux and MS platforms). I have one Windows XP machine that acts very strangely: it freezes. My monitoring reports ping timeouts for the VM, and the VM spins 2 or 3 cores, using all the CPU. The interesting thing is that once you open the console, it suddenly starts working again. You can see the clock catching up, as if the machine had been frozen in time, and everything works normally once the timer has caught up. It usually happens about once a month, although it happened both yesterday and today.

This machine is on CentOS 6, qemu-kvm-0.12.1.2-2.448.el6_6, kernel 2.6.32-504.3.3.el6.x86_64.

I was able to do some debugging while the machine was frozen, so I have some things to work with:

# virsh qemu-monitor-command --hmp DBserver 'info cpus'
* CPU #0: pc=0x0000000080501fdd thread_id=32595
  CPU #1: pc=0x00000000806e7a9b thread_id=32596
  CPU #2: pc=0x00000000ba2da162 (halted) thread_id=32597
  CPU #3: pc=0x00000000ba2da162 (halted) thread_id=32598

In both yesterday's and today's event, CPU0 was stopped at 0x0000000080501fdd. I've disassembled the function and got this:

 0x0000000080501fb5: int3
 0x0000000080501fb6: mov %edi,%edi
 0x0000000080501fb8: push %ebp
 0x0000000080501fb9: mov %esp,%ebp
 0x0000000080501fbb: push %esi
 0x0000000080501fbc: mov %fs:0x20,%eax
 0x0000000080501fc2: mov 0x8(%ebp),%ecx
 0x0000000080501fc5: lea -0x1(%ecx),%esi
 0x0000000080501fc8: test %esi,%ecx
 0x0000000080501fca: lea 0x7ec(%eax),%edx
 0x0000000080501fd0: pop %esi
 0x0000000080501fd1: je 0x80501fdd
 0x0000000080501fd3: lea 0x7a0(%eax),%edx
 0x0000000080501fd9: jmp 0x80501fdd
*0x0000000080501fdb: pause
 0x0000000080501fdd: cmpl $0x0,(%edx)
 0x0000000080501fe0: jne 0x80501fdb
 0x0000000080501fe2: pop %ebp
 0x0000000080501fe3: ret $0x4
 0x0000000080501fe6: int3

The mov %edi,%edi is clearly the start of some function.
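Read as C, the disassembly above comes out roughly as follows. This is only a sketch to make the control flow easier to follow: the function and macro names are made up, the meaning of the two fields at offsets 0x7ec and 0x7a0 is not confirmed, and the spin loop is bounded (the real one is not) so that the model can run in user space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offsets relative to the pointer loaded from %fs:0x20, as seen in the
 * disassembly. The names are hypothetical; only the numbers are from the dump. */
#define OFF_ONE_BIT   0x7ecu  /* lea 0x7ec(%eax),%edx, kept when je is taken */
#define OFF_MANY_BITS 0x7a0u  /* lea 0x7a0(%eax),%edx, used otherwise */

/* test %esi,%ecx with esi = ecx - 1 sets ZF when the argument has at most
 * one bit set, which selects the first offset; otherwise the second. */
static uint32_t polled_offset(uint32_t arg)
{
    return (arg & (arg - 1u)) == 0u ? OFF_ONE_BIT : OFF_MANY_BITS;
}

/* Address that ends up in EDX for a given base pointer (EAX) and argument (ECX). */
static uint32_t polled_address(uint32_t prcb, uint32_t arg)
{
    return prcb + polled_offset(arg);
}

/* Bounded model of the pause / cmpl $0x0,(%edx) / jne loop.
 * Returns the number of pauses taken, or -1 if the word never read as zero
 * within max_spins iterations. */
static long spin_until_zero(const volatile uint32_t *word, long max_spins)
{
    long spins = 0;

    assert(word != NULL);
    while (*word != 0u) {
        if (spins == max_spins)
            return -1;
        spins++;  /* stands in for the pause instruction */
    }
    return spins;
}
```

With the EAX and ECX values from the 'info registers' output in this message, polled_address(0xffdff120, 0x0000000e) gives 0xffdff8c0, which matches EDX. The only way out of the loop is for something else to store zero at that address.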
From what I've been able to understand, the code fetches the _KPRCB structure (%fs:0x20) and then spins between fdb and fe0, checking PacketBarrier (?) at the address in EDX (0xffdff8c0). Now, $pc always shows the fdd address. Shouldn't it move around between fdb and fe0? It looks as if it were stuck at fdd.

# virsh qemu-monitor-command --hmp DBserver 'info registers'
EAX=ffdff120 EBX=c06ddf58 ECX=0000000e EDX=ffdff8c0
ESI=be6e3921 EDI=c06ddf60 EBP=ba4ff708 ESP=ba4ff708
EIP=80501fdd EFL=00000202 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0023 00000000 ffffffff 00c0f300 DPL=3 DS [-WA]
CS =0008 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
SS =0010 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
DS =0023 00000000 ffffffff 00c0f300 DPL=3 DS [-WA]
FS =0030 ffdff000 00001fff 00c09300 DPL=0 DS [-WA]
GS =0000 00000000 000fffff 00000000
LDT=0000 00000000 000fffff 00000000
TR =0028 80042000 000020ab 00008b00 DPL=0 TSS32-busy
GDT= 8003f000 000003ff
IDT= 8003f400 000007ff
CR0=8001003b CR2=dbbec000 CR3=0b3c0020 CR4=000006f8
DR0=00000000 DR1=00000000 DR2=00000000 DR3=00000000
DR6=ffff0ff0 DR7=00000400
FCW=027f FSW=0020 [ST=0] FTW=00 MXCSR=00001fa0
FPR0=8053632b003c1658 c048 FPR1=e1e0c048bf80f6ab 76f8
FPR2=e1e0000000000000 0023 FPR3=0b017c30003c1658 0000
FPR4=0000003bba1a7604 1e64 FPR5=0007268c00000000 003b
FPR6=000002020000001b 2684 FPR7=e3e0a9b4e1b50de4 ca0b
XMM00=0000000000a1fc95000000000020027f XMM01=0000ffff00001fa000001c4c00000001
XMM02=000000000000c0488053632b003c1658 XMM03=00000000000076f8e1e0c048bf80f6ab
XMM04=0000000000000023e1e0000000000000 XMM05=00000000000000000b017c30003c1658
XMM06=0000000000001e640000003bba1a7604 XMM07=000000000000003b0007268c00000000

Clearly, the value at the address in EDX is not 0:

[root@linux ~]# virsh qemu-monitor-command --hmp DBserver 'x/1xb 0xFFDFF8C0'
00000000ffdff8c0: 0x0e
[root@linux ~]# virt-manager
[root@linux ~]# virsh qemu-monitor-command --hmp DBserver 'x/1xb 0xFFDFF8C0'
00000000ffdff8c0: 0x00

However, as soon as the VM console is opened and the machine starts running again, the value at the address
in EDX is set to 0 and the loop is broken.

Does anybody recognize what function that is? What could possibly be going on, such that opening the console and moving the mouse a little unfreezes the machine?

The VM has the .81 virtio drivers from the Fedora repo at the moment. The configuration of the machine is pretty standard:

<!--
WARNING: THIS IS AN AUTO-GENERATED FILE. CHANGES TO IT ARE LIKELY TO BE
OVERWRITTEN AND LOST. Changes to this xml configuration should be made using:
  virsh edit DBserver
or other application using the libvirt API.
-->

<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
  <name>DBserver</name>
  <uuid>e42b4cf2-7264-515f-4d24-6267eaa24be8</uuid>
  <memory unit='KiB'>3145728</memory>
  <currentMemory unit='KiB'>3145728</currentMemory>
  <vcpu placement='static'>4</vcpu>
  <os>
    <type arch='x86_64' machine='rhel6.6.0'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <cpu>
    <topology sockets='1' cores='4' threads='4'/>
  </cpu>
  <clock offset='localtime'>
    <timer name='rtc' tickpolicy='catchup'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none' io='native'/>
      <source dev='/dev/drbd1'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </disk>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none' io='native'/>
      <source dev='/dev/disk/by-id/usb-WD_Ext_HDD_1021_574D415A4138353838383731-0:0'/>
      <target dev='vdb' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <target dev='hdc' bus='ide'/>
      <readonly/>
      <address type='drive' controller='0' bus='1' target='0' unit='0'/>
    </disk>
    <controller type='usb' index='0' model='ich9-ehci1'>
      <address type='pci'
               domain='0x0000' bus='0x00' slot='0x05' function='0x7'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci1'>
      <master startport='0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0' multifunction='on'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci2'>
      <master startport='2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x1'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci3'>
      <master startport='4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x2'/>
    </controller>
    <controller type='ide' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:a6:92:ca'/>
      <source bridge='br0'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target port='0'/>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <input type='mouse' bus='ps2'/>
    <graphics type='vnc' port='-1' autoport='yes'/>
    <video>
      <model type='vga' vram='9216' heads='1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </memballoon>
  </devices>
  <qemu:commandline>
    <qemu:arg value='-set'/>
    <qemu:arg value='device.virtio-disk0.x-data-plane=on'/>
  </qemu:commandline>
</domain>

The config above is no longer the original one: I have already experimented with removing the USB tablet (and installing the VMware mouse drivers), turning x-data-plane on and so on, hoping to solve the problem.

Is there anything else I can check the next time the machine freezes?

Regards,
Saso Slavicic