Re: KVM VM hangs forever in R state within lock

On Mon, Jun 20, 2016 at 12:28 AM, Andrey Korolyov <andrey@xxxxxxx> wrote:
> Hi,
>
> I've observed this issue previously on an old 3.10 branch but wrote it
> off because I could not reproduce it in any meaningful way. Currently I
> am seeing it on a 3.10 branch where the well-known KVM-related and
> RCU-related issues are more or less patched.
>
> How to reach the problematic state:
>  - run a hypervisor for a very long time; it took a year and a half
> for the issue to show up on the old branch mentioned above, but on the
> newer kernel, probably due to higher load, it took roughly half a
> year,
>  - suddenly a single VM takes a lock and becomes unresponsive while
> all of its threads show the Running state; under this lock the VM is
> neither killable via SIGKILL nor freezable via the freezer cgroup. The
> only obvious symptom is that it does not consume any CPU cycles
> anymore (no counter inside sched info), and of course it is
> no longer debuggable. As a result, it is nearly impossible to say
> at a glance where the lock sits, as there are no distinctive processes
> that are at least sleeping and could be ruled out.
>
> It looks like I may have hit a pure scheduler issue, so if nothing
> in the attached recursive stack/status dump rings a bell, I'd
> CC the scheduler folks. Timer/RCU configs are attached for
> convenience.
>
> Thanks for looking into this!
>
> stack:
> http://xdel.ru/downloads/vm-sched-hang/stack.txt
> status:
> http://xdel.ru/downloads/vm-sched-hang/status.txt

-qemu-devel

So far the issue seems to come down to an erratum of a specific
CPU series (E5 2620v1 parts made within the first two quarters after the
official announcement, though Intel's errata document for this CPU does
not list anything similar). Nevertheless, the issue should be
dealt with properly, but I am a bit out of ideas for further debugging:
it is not clearly a scheduler lockup, and the Running state displayed by
the OS is nothing more than a side effect of this bug.
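
For anyone who wants to check for the same symptom (threads reported as
Running while their CPU time never advances), here is a minimal Python
sketch. It is only an illustration, not the tooling used for the dumps
above: the function names are made up, and it assumes the kernel exposes
/proc/<pid>/task/<tid>/stat and /proc/<pid>/task/<tid>/schedstat, with
the first schedstat field being on-CPU time in nanoseconds.

```python
import os
import time


def parse_stat(line):
    """Return (state, utime + stime in clock ticks) from a /proc/.../stat line."""
    # comm (field 2) may contain spaces or parentheses, so split after the last ')'.
    rest = line[line.rindex(")") + 2:].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    return rest[0], int(rest[11]) + int(rest[12])


def parse_schedstat(line):
    """Return on-CPU time in nanoseconds (first field of a schedstat line)."""
    return int(line.split()[0])


def snapshot(pid):
    """Map tid -> (state, on-CPU ns) for every thread of pid that can be read."""
    result = {}
    task_dir = "/proc/%d/task" % pid
    for tid in os.listdir(task_dir):
        try:
            with open(os.path.join(task_dir, tid, "stat")) as f:
                state, _ticks = parse_stat(f.read())
            with open(os.path.join(task_dir, tid, "schedstat")) as f:
                cpu_ns = parse_schedstat(f.read())
        except OSError:
            # Thread exited in the meantime, or schedstat is not available.
            continue
        result[int(tid)] = (state, cpu_ns)
    return result


def stalled_threads(before, after):
    """Return tids that were in R state in both snapshots with no CPU time gained."""
    stalled = []
    for tid, (state, cpu_ns) in after.items():
        prev = before.get(tid)
        if prev and prev[0] == "R" and state == "R" and cpu_ns == prev[1]:
            stalled.append(tid)
    return sorted(stalled)


def watch(pid, interval=10.0):
    """Take two snapshots interval seconds apart and report suspicious threads."""
    before = snapshot(pid)
    time.sleep(interval)
    return stalled_threads(before, snapshot(pid))
```

Calling watch() with the PID of the qemu process returns the thread IDs
that stayed in R state without gaining any on-CPU time over the
interval. A thread that is merely starved on a busy host would look the
same over a short window, so a long interval (or several runs) is
needed before drawing conclusions.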