Re: BUG: soft lockup - CPU#0 stuck for 26s! with nested KVM on 5.19.x

On Wed, Sep 07, 2022, František Šumšal wrote:
> Hello!
> 
> In our Arch Linux part of the upstream systemd CI I recently noticed an
> uptrend in CPU soft lockups when running one of our tests. This test runs
> several systemd-nspawn containers in succession and sometimes the underlying
> VM locks up due to a CPU soft lockup

By "underlying VM", do you mean L1 or L2?  Where

    L0 == Bare Metal
    L1 == Arch Linux (KVM, 5.19.5-arch1-1/5.19.7-arch1-1)
    L2 == Arch Linux (nested KVM or QEMU TCG, 5.19.5-arch1-1/5.19.7-arch1-1)

> (just to clarify, the topology is: CentOS Stream 8 (baremetal,
> 4.18.0-305.3.1.el8) -> Arch Linux (KVM, 5.19.5-arch1-1/5.19.7-arch1-1) ->
> Arch Linux (nested KVM or QEMU TCG, happens with both,
> 5.19.5-arch1-1/5.19.7-arch1-1) -> nspawn containers).

Since this repros with TCG, that rules out nested KVM as the culprit.

> I did some further testing, and it reproduces even when the baremetal is my
> local Fedora 36 machine (5.17.12-300.fc36.x86_64).
> 
> Unfortunately, I can't provide a simple and reliable reproducer, as I can
> reproduce it only with that particular test and not reliably (sometimes it's
> the first iteration, sometimes it takes an hour or more to reproduce).
> However, I'd be more than glad to collect more information from one such
> machine, if possible.

...

> Also, in one instance, the machine died with:

Probably unrelated, but same question as above: which layer does "the machine"
refer to?


