Re: XP machine freeze

On Mon, Mar 16, 2015 at 04:10:40PM +0100, Saso Slavicic wrote:
> Hi,
> 
> I'm fairly experienced with KVM (CentOS 5/6), running about a dozen servers
> with 20-30 different (Linux & MS platform) systems.
> I have one Windows XP machine that acts very strangely: it freezes. My
> monitoring reports ping timeouts for the VM, and the machine spins 2 or 3
> cores at full CPU. The interesting thing is that once you open the
> console, it suddenly starts working again. You can see the clock catching
> up, as if it had been frozen in time, and everything works normally once
> the timer has caught up. It usually happens about once a month, although
> it happened both yesterday and today.
> 
> This machine is on CentOS 6, qemu-kvm-0.12.1.2-2.448.el6_6, kernel
> 2.6.32-504.3.3.el6.x86_64.
> I was able to do some debugging when the machine was frozen, so I got some
> things to work with:
> 
> # virsh qemu-monitor-command --hmp DBserver 'info cpus'
> * CPU #0: pc=0x0000000080501fdd thread_id=32595
>   CPU #1: pc=0x00000000806e7a9b thread_id=32596
>   CPU #2: pc=0x00000000ba2da162 (halted) thread_id=32597
>   CPU #3: pc=0x00000000ba2da162 (halted) thread_id=32598
> 
> In both yesterday's and today's events, CPU0 was stopped at
> 0x0000000080501fdd. I've disassembled the function and got this:
> 
>  0x0000000080501fb5:  int3
>  0x0000000080501fb6:  mov    %edi,%edi
>  0x0000000080501fb8:  push   %ebp
>  0x0000000080501fb9:  mov    %esp,%ebp
>  0x0000000080501fbb:  push   %esi
>  0x0000000080501fbc:  mov    %fs:0x20,%eax
>  0x0000000080501fc2:  mov    0x8(%ebp),%ecx
>  0x0000000080501fc5:  lea    -0x1(%ecx),%esi
>  0x0000000080501fc8:  test   %esi,%ecx
>  0x0000000080501fca:  lea    0x7ec(%eax),%edx
>  0x0000000080501fd0:  pop    %esi
>  0x0000000080501fd1:  je     0x80501fdd
>  0x0000000080501fd3:  lea    0x7a0(%eax),%edx
>  0x0000000080501fd9:  jmp    0x80501fdd
>  *0x0000000080501fdb:  pause
>  0x0000000080501fdd:  cmpl   $0x0,(%edx)
>  0x0000000080501fe0:  jne    0x80501fdb
>  0x0000000080501fe2:  pop    %ebp
>  0x0000000080501fe3:  ret    $0x4
>  0x0000000080501fe6:  int3
> 
> "mov %edi,%edi" is clearly the start of some function. From what I've been
> able to understand, the code fetches the _KPRCB structure (%fs:0x20) and
> then spins in a loop between fdb and fe0, checking PacketBarrier (?) at the
> address in EDX (0xffdff8c0). However, $pc always shows the fdd address.
> Shouldn't it move between fdb and fe0? It seems as if it were stuck at fdd.
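
To make the address selection in the listing above easier to follow, here is
a minimal Python sketch of what the routine appears to compute before it
starts spinning. The offsets 0x7ec and 0x7a0 and the register values are taken
from the disassembly and the 'info registers' output in this mail; the
function name spin_target and the reading of the argument as a bitmask are
assumptions made only for illustration, not something confirmed by symbols.

```python
# Sketch of the address selection done by the disassembled routine.
# Offsets and register values come from the listing and register dump in
# this thread; the name spin_target and the "mask" reading are hypothetical.

def spin_target(prcb, mask):
    """Return the address the loop polls (the value loaded into EDX)."""
    # 'lea -0x1(%ecx),%esi; test %esi,%ecx' computes mask & (mask - 1).
    # The result is zero when at most one bit is set, and then 'je'
    # keeps the 0x7ec offset that was loaded just before.
    if (mask & (mask - 1)) & 0xFFFFFFFF == 0:
        return (prcb + 0x7EC) & 0xFFFFFFFF
    # More than one bit set: fall through to the 0x7a0 offset.
    return (prcb + 0x7A0) & 0xFFFFFFFF


eax = 0xFFDFF120  # EAX in the dump (value read from %fs:0x20)
ecx = 0x0000000E  # ECX in the dump (the routine's stack argument)

print(hex(spin_target(eax, ecx)))  # prints 0xffdff8c0, matching EDX
```

If the argument really is a processor bitmask (again, an assumption), 0xe
would correspond to CPUs 1 to 3, which may be worth comparing with the
'info cpus' output where CPUs 2 and 3 are shown as halted.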
> 
> # virsh qemu-monitor-command --hmp DBserver 'info registers'
>  EAX=ffdff120 EBX=c06ddf58 ECX=0000000e EDX=ffdff8c0
>  ESI=be6e3921 EDI=c06ddf60 EBP=ba4ff708 ESP=ba4ff708
>  EIP=80501fdd EFL=00000202 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
>  ES =0023 00000000 ffffffff 00c0f300 DPL=3 DS   [-WA]
>  CS =0008 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
>  SS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
>  DS =0023 00000000 ffffffff 00c0f300 DPL=3 DS   [-WA]
>  FS =0030 ffdff000 00001fff 00c09300 DPL=0 DS   [-WA]
>  GS =0000 00000000 000fffff 00000000
>  LDT=0000 00000000 000fffff 00000000
>  TR =0028 80042000 000020ab 00008b00 DPL=0 TSS32-busy
>  GDT=     8003f000 000003ff
>  IDT=     8003f400 000007ff
>  CR0=8001003b CR2=dbbec000 CR3=0b3c0020 CR4=000006f8
>  DR0=00000000 DR1=00000000 DR2=00000000 DR3=00000000
>  DR6=ffff0ff0 DR7=00000400
>  FCW=027f FSW=0020 [ST=0] FTW=00 MXCSR=00001fa0
>  FPR0=8053632b003c1658 c048 FPR1=e1e0c048bf80f6ab 76f8
>  FPR2=e1e0000000000000 0023 FPR3=0b017c30003c1658 0000
>  FPR4=0000003bba1a7604 1e64 FPR5=0007268c00000000 003b
>  FPR6=000002020000001b 2684 FPR7=e3e0a9b4e1b50de4 ca0b
>  XMM00=0000000000a1fc95000000000020027f
>  XMM01=0000ffff00001fa000001c4c00000001
>  XMM02=000000000000c0488053632b003c1658
>  XMM03=00000000000076f8e1e0c048bf80f6ab
>  XMM04=0000000000000023e1e0000000000000
>  XMM05=00000000000000000b017c30003c1658
>  XMM06=0000000000001e640000003bba1a7604
>  XMM07=000000000000003b0007268c00000000
> 
> Clearly, the value at the address in EDX is not 0:
> 
> [root@linux ~]# virsh qemu-monitor-command --hmp DBserver 'x/1xb 0xFFDFF8C0'
> 00000000ffdff8c0: 0x0e
> 
> [root@linux ~]# virt-manager
> 
> [root@linux ~]# virsh qemu-monitor-command --hmp DBserver 'x/1xb 0xFFDFF8C0'
> 00000000ffdff8c0: 0x00
> 
> However, as soon as the VM console is opened and the machine resumes, the
> value at the address in EDX is set to 0 and the loop exits.
> Does anybody recognize what function that is? What could possibly cause
> opening the console and moving the mouse a little to unfreeze the
> machine?
> The VM has the .81 virtio drivers from the Fedora repo at the moment.

Could you generate a Windows crash dump while the guest is frozen?

https://support.microsoft.com/en-us/kb/254649

https://support.microsoft.com/en-us/kb/972110
Step 7: Generate a complete crash dump file or a kernel crash dump file
by using an NMI on a Windows-based system

(you can inject NMIs via the QEMU monitor).
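
As a rough sketch of how the NMI could be sent from the host, the Python
snippet below only builds the command line. It assumes the libvirt on the
host provides the "virsh inject-nmi" subcommand and that the guest has
already been set up to write a dump on NMI as described in the KB article;
the helper name nmi_command is made up for this example.

```python
import shlex


def nmi_command(domain):
    """Build the virsh command that asks libvirt to inject an NMI."""
    # Assumption: this host's libvirt supports 'virsh inject-nmi'.
    return ["virsh", "inject-nmi", domain]


# Print the command instead of running it, so it can be reviewed first.
print(shlex.join(nmi_command("DBserver")))
```

To actually send the NMI, the list can be passed to subprocess.run() while
the guest is frozen.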

> 
> The configuration of the machine is pretty standard:
> 
> <!--
> WARNING: THIS IS AN AUTO-GENERATED FILE. CHANGES TO IT ARE LIKELY TO BE
> OVERWRITTEN AND LOST. Changes to this xml configuration should be made
> using:
>   virsh edit DBserver
> or other application using the libvirt API.
> -->
> 
>  <domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
>   <name>DBserver</name>
>   <uuid>e42b4cf2-7264-515f-4d24-6267eaa24be8</uuid>
>   <memory unit='KiB'>3145728</memory>
>   <currentMemory unit='KiB'>3145728</currentMemory>
>   <vcpu placement='static'>4</vcpu>
>   <os>
>     <type arch='x86_64' machine='rhel6.6.0'>hvm</type>
>     <boot dev='hd'/>
>   </os>
>   <features>
>     <acpi/>
>     <apic/>
>     <pae/>
>   </features>
>   <cpu>
>     <topology sockets='1' cores='4' threads='4'/>
>   </cpu>
>   <clock offset='localtime'>
>     <timer name='rtc' tickpolicy='catchup'/>
>   </clock>
>   <on_poweroff>destroy</on_poweroff>
>   <on_reboot>restart</on_reboot>
>   <on_crash>restart</on_crash>
>   <devices>
>     <emulator>/usr/libexec/qemu-kvm</emulator>
>     <disk type='block' device='disk'>
>       <driver name='qemu' type='raw' cache='none' io='native'/>
>       <source dev='/dev/drbd1'/>
>       <target dev='vda' bus='virtio'/>
>       <address type='pci' domain='0x0000' bus='0x00' slot='0x03'
> function='0x0'/>
>     </disk>
>     <disk type='block' device='disk'>
>       <driver name='qemu' type='raw' cache='none' io='native'/>
>       <source
> dev='/dev/disk/by-id/usb-WD_Ext_HDD_1021_574D415A4138353838383731-0:0'/>
>       <target dev='vdb' bus='virtio'/>
>       <address type='pci' domain='0x0000' bus='0x00' slot='0x04'
> function='0x0'/>
>     </disk>
>     <disk type='file' device='cdrom'>
>       <driver name='qemu' type='raw'/>
>       <target dev='hdc' bus='ide'/>
>       <readonly/>
>       <address type='drive' controller='0' bus='1' target='0' unit='0'/>
>     </disk>
>     <controller type='usb' index='0' model='ich9-ehci1'>
>       <address type='pci' domain='0x0000' bus='0x00' slot='0x05'
> function='0x7'/>
>     </controller>
>     <controller type='usb' index='0' model='ich9-uhci1'>
>       <master startport='0'/>
>       <address type='pci' domain='0x0000' bus='0x00' slot='0x05'
> function='0x0' multifunction='on'/>
>     </controller>
>     <controller type='usb' index='0' model='ich9-uhci2'>
>       <master startport='2'/>
>       <address type='pci' domain='0x0000' bus='0x00' slot='0x05'
> function='0x1'/>
>     </controller>
>     <controller type='usb' index='0' model='ich9-uhci3'>
>       <master startport='4'/>
>       <address type='pci' domain='0x0000' bus='0x00' slot='0x05'
> function='0x2'/>
>     </controller>
>     <controller type='ide' index='0'>
>       <address type='pci' domain='0x0000' bus='0x00' slot='0x01'
> function='0x1'/>
>     </controller>
>     <interface type='bridge'>
>       <mac address='52:54:00:a6:92:ca'/>
>       <source bridge='br0'/>
>       <model type='virtio'/>
>       <address type='pci' domain='0x0000' bus='0x00' slot='0x06'
> function='0x0'/>
>     </interface>
>     <serial type='pty'>
>       <target port='0'/>
>     </serial>
>     <console type='pty'>
>       <target type='serial' port='0'/>
>     </console>
>     <input type='mouse' bus='ps2'/>
>     <graphics type='vnc' port='-1' autoport='yes'/>
>     <video>
>       <model type='vga' vram='9216' heads='1'/>
>       <address type='pci' domain='0x0000' bus='0x00' slot='0x02'
> function='0x0'/>
>     </video>
>     <memballoon model='virtio'>
>       <address type='pci' domain='0x0000' bus='0x00' slot='0x07'
> function='0x0'/>
>     </memballoon>
>   </devices>
>   <qemu:commandline>
>     <qemu:arg value='-set'/>
>     <qemu:arg value='device.virtio-disk0.x-data-plane=on'/>
>   </qemu:commandline>
>  </domain>
> 
> The above config has already been changed: I first experimented with
> removing the usb tablet (and installing vmware mouse drivers), turning
> 'x-data-plane' on, and so on, hoping to solve the problem... Is there
> anything else I can check the next time the machine freezes?
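
One thing that might help next time is to take several samples of
'info cpus' and of the polled byte instead of a single one, to see whether
the pc ever leaves fdd and whether the value changes before the console is
opened. Below is a small Python sketch that parses the two monitor outputs
shown earlier in this mail; the function names are hypothetical and the
regular expressions assume the output format stays exactly as quoted.

```python
import re

# Patterns written against the HMP output quoted in this thread.
CPU_LINE = re.compile(r"CPU #(\d+):\s+pc=0x([0-9a-fA-F]+)(\s+\(halted\))?")
MEM_LINE = re.compile(r"^([0-9a-fA-F]+):\s+0x([0-9a-fA-F]+)")


def parse_info_cpus(text):
    """Map CPU index -> (pc, halted) from 'info cpus' output."""
    cpus = {}
    for m in CPU_LINE.finditer(text):
        cpus[int(m.group(1))] = (int(m.group(2), 16), m.group(3) is not None)
    return cpus


def parse_mem_byte(text):
    """Return (address, value) from one line of 'x/1xb ADDR' output."""
    m = MEM_LINE.match(text.strip())
    if m is None:
        raise ValueError("unexpected monitor output: %r" % text)
    return int(m.group(1), 16), int(m.group(2), 16)


sample = """\
* CPU #0: pc=0x0000000080501fdd thread_id=32595
  CPU #1: pc=0x00000000806e7a9b thread_id=32596
  CPU #2: pc=0x00000000ba2da162 (halted) thread_id=32597
  CPU #3: pc=0x00000000ba2da162 (halted) thread_id=32598
"""

for index, (pc, halted) in sorted(parse_info_cpus(sample).items()):
    print(index, hex(pc), "halted" if halted else "running")
print(parse_mem_byte("00000000ffdff8c0: 0x0e"))
```

Feeding these parsers from a loop around the virsh qemu-monitor-command calls
would give a timeline of pc values and of the byte at 0xffdff8c0.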
> 
> Regards,
> Saso Slavicic
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



