I was running two guests on kernel 3.7.10 but have now switched one to stock 2.6.32, and neither has crashed yet. I will leave them running as they are and, if things stay stable, will switch the other to the stock kernel as well. I am wondering whether I have hit a kernel buglet.

Thank you for the ftrace info. Have a great weekend.

----- Original Message -----
From: "Gleb Natapov" <gleb@xxxxxxxxxx>
To: "Phil Daws" <uxbod@xxxxxxxxxxxx>
Cc: kvm@xxxxxxxxxxxxxxx
Sent: Friday, 12 April, 2013 4:13:16 PM
Subject: Re: KVM Guest Lock up (100%) again!

On Fri, Apr 12, 2013 at 03:10:43PM +0100, Phil Daws wrote:
> Well, this is still happening ... I have tried to isolate what could be causing it but have not had much luck yet. I thought the VMs might be IO bound, but that is not the case, and I even tried upping the vCPU allocation from one to two as there is plenty of headroom. When it locks up I see this in strace:
>
> [pid 1343] read(14, 0x7fff82aeb360, 4096) = -1 EAGAIN (Resource temporarily unavailable)
> [pid 1343] read(7, "\0", 512) = 1
> [pid 1343] read(7, 0x7fff82aec160, 512) = -1 EAGAIN (Resource temporarily unavailable)
> [pid 1343] select(26, [7 10 13 14 16 17 22 25], [], [], {1, 0}) = 1 (in [16], left {0, 999981})
> [pid 1343] read(16, "\16\0\0\0\0\0\0\0\376\377\377\377\0\0\0\0\0\0\0\0\0\0\0\0\1\0\0\0\0\0\0\0"..., 128) = 128
> [pid 1343] rt_sigaction(SIGALRM, NULL, {0x7f210b2c0510, ~[KILL STOP RTMIN RT_1], SA_RESTORER, 0x7f210ac22500}, 8) = 0
> [pid 1343] write(8, "\0", 1) = 1
> [pid 1343] write(15, "\1\0\0\0\0\0\0\0", 8) = 8
> [pid 1343] read(16, 0x7fff82aec2d0, 128) = -1 EAGAIN (Resource temporarily unavailable)
> [pid 1343] timer_gettime(0x1, {it_interval={0, 0}, it_value={0, 0}}) = 0
> [pid 1343] timer_settime(0x1, 0, {it_interval={0, 0}, it_value={0, 656000000}}, NULL) = 0
> [pid 1343] select(26, [7 10 13 14 16 17 22 25], [], [], {1, 0}) = 2 (in [7 14], left {0, 999998})
> [pid 1343] read(14, "\1\0\0\0\0\0\0\0", 4096) = 8
> [pid 1343] read(14, 0x7fff82aeb360, 4096) = -1 EAGAIN (Resource temporarily unavailable)
> [pid 1343] read(7, "\0", 512) = 1
> [pid 1343] read(7, 0x7fff82aec160, 512) = -1 EAGAIN (Resource temporarily unavailable)
>
> Does that shed any light? I am trying to find a how-to for upgrading to the latest KVM/QEMU.
>
Is the lockup with upstream now? strace is not very helpful for diagnosing KVM problems. Try running ftrace: http://www.linux-kvm.org/page/Tracing

--
			Gleb.
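
For anyone following the ftrace suggestion, here is a minimal sketch of switching on the kvm trace events through the kernel's tracing files and reading back the tail of the buffer once a guest has locked up. It assumes debugfs is mounted at /sys/kernel/debug, that the host kernel exposes the kvm event group, and that it is run as root on the host. The helper names (start_kvm_trace, stop_kvm_trace, read_kvm_trace) are made up for illustration and are not part of any KVM tooling.

```python
from pathlib import Path
import time

# Assumption: debugfs is mounted in the usual place on the host.
TRACING_DIR = Path("/sys/kernel/debug/tracing")


def start_kvm_trace(tracing_dir=TRACING_DIR):
    """Enable every event in the kvm group and turn tracing on."""
    tracing_dir = Path(tracing_dir)
    (tracing_dir / "events" / "kvm" / "enable").write_text("1")
    (tracing_dir / "tracing_on").write_text("1")


def stop_kvm_trace(tracing_dir=TRACING_DIR):
    """Turn tracing off and disable the kvm events again."""
    tracing_dir = Path(tracing_dir)
    (tracing_dir / "tracing_on").write_text("0")
    (tracing_dir / "events" / "kvm" / "enable").write_text("0")


def read_kvm_trace(tracing_dir=TRACING_DIR, limit=200):
    """Return the last `limit` trace lines, skipping '#' header lines."""
    tracing_dir = Path(tracing_dir)
    with open(tracing_dir / "trace") as f:
        lines = [line.rstrip("\n") for line in f if not line.startswith("#")]
    return lines[-limit:]


if __name__ == "__main__":
    # Hypothetical usage: capture ten seconds while the guest is spinning.
    start_kvm_trace()
    time.sleep(10)
    stop_kvm_trace()
    for line in read_kvm_trace():
        print(line)
```

The tail of the buffer is usually the interesting part for a lockup, since it shows what the vCPU threads were doing just before tracing stopped. The wiki page linked above describes the supported way to capture a trace for the list.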