G'day all,
I'm at a bit of a loss. I run a CCTV storage server on a Windows XP
guest running on an AMD64 (Piledriver) host with a 64-bit kernel and
userspace. This has been running well for a number of years.
I recently upgraded from a 3.15 kernel to 3.17 and now 3.18. The 3.15
kernel runs fine. Running 3.17 or later causes the VM to latch up after
between 30 minutes and 24 hours running. I say latch up rather than lock
up as if I run virt-viewer and wiggle the mouse around it springs back
to life, although the system clock is now hours behind. I can see when
it latches up as I have cacti monitoring its network interface. The
system clock stops dead as soon as the traffic does.
When it latches up, qemu spins using 100% of however many cores I have
allocated. It does not respond to the virtio network adapter, and the
system clock ceases ticking.
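
A minimal sketch of how the latch-up could be spotted from the host side,
assuming the only symptom being watched is the qemu process pegging all of
its allocated cores. It samples utime + stime from /proc/<pid>/stat and
converts the delta into "cores busy". The function names, the 0.95
threshold and the 10 second interval are illustrative choices, not
anything taken from qemu or KVM.

```python
import os
import time


def cpu_ticks(stat_line: str) -> int:
    """Return utime + stime (clock ticks) from one /proc/<pid>/stat line."""
    # The comm field is wrapped in parentheses and may contain spaces,
    # so split on the last ')' before tokenising the remaining fields.
    fields = stat_line.rsplit(")", 1)[1].split()
    # fields[0] is 'state' (field 3 in proc(5)); utime is field 14 and
    # stime is field 15, i.e. indexes 11 and 12 here.
    return int(fields[11]) + int(fields[12])


def cores_busy(ticks_before: int, ticks_after: int,
               interval_s: float, hz: int) -> float:
    """Average number of cores kept busy over the sampling interval."""
    return (ticks_after - ticks_before) / (hz * interval_s)


def looks_latched(busy: float, vcpus: int, threshold: float = 0.95) -> bool:
    """True when the process is using nearly all of its allocated cores.

    The threshold is an assumption; tune it for the workload.
    """
    return busy >= threshold * vcpus


def watch(pid: int, vcpus: int, interval_s: float = 10.0) -> None:
    """Poll the given qemu PID forever and log when it appears to spin."""
    hz = os.sysconf("SC_CLK_TCK")
    path = f"/proc/{pid}/stat"
    with open(path) as f:
        prev = cpu_ticks(f.read())
    while True:
        time.sleep(interval_s)
        with open(path) as f:
            cur = cpu_ticks(f.read())
        busy = cores_busy(prev, cur, interval_s, hz)
        if looks_latched(busy, vcpus):
            print(time.strftime("%F %T"), f"qemu spinning: {busy:.2f} cores")
        prev = cur
```

Calling watch() with the qemu PID and the number of vCPUs given to the
guest would timestamp each spin, which can then be lined up against the
point where the cacti traffic graph goes flat.
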
I have several XP VMs on this machine, all configured with the same
hardware (virtio storage & network) and only the CCTV VM exhibits this
behavior.
Reverting back to 3.15.x makes it go away. I've not had the opportunity
to try 3.16 yet but it's on my todo list for this weekend.
I've tried several versions of qemu, and I'm currently on git HEAD. The
problem tracks the kernel.
This is a production box, and given it can take hours to reproduce I
can't really run a bisect on it. I've also been unable to replicate it
thus far on an "identical" test machine, although I'll continue to try
as it'll make bisecting more viable.
It's almost as if the guest stops responding to interrupts, yet console
activity wakes it back up.
I'm only posting this in the vain hope it pricks someone's spidey senses
and gives me a bit of a leg up in debugging it.
I'm using guest drivers from spice-guest-tools-0.74 on all the guests.
Regards,
Brad