[Bug 218684] CPU soft lockups in KVM VMs on kernel 6.x after switching hypervisor from C8S to C9S

https://bugzilla.kernel.org/show_bug.cgi?id=218684

Frantisek Sumsal (frantisek@xxxxxxxxx) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |PATCH_ALREADY_AVAILABLE

--- Comment #2 from Frantisek Sumsal (frantisek@xxxxxxxxx) ---
(In reply to Sean Christopherson from comment #1)
> On Fri, Apr 05, 2024, bugzilla-daemon@xxxxxxxxxx wrote:
<...snip...>
> 
> Hmm, the vCPU is stuck in the idle HLT loop, which suggests that the vCPU
> isn't waking up when it should.  But it does obviously get the hrtimer
> interrupt, so it's not completely hosed.
> 
> Are you able to test custom kernels?  If so, bisecting the host kernel is
> likely the easiest way to figure out what's going on.  It might not be the
> _fastest_, but it should be straightforward, and shouldn't require much KVM
> expertise, i.e. won't require lengthy back-and-forth discussions if no one
> immediately spots a bug.
> 
> And before bisecting, it'd be worth seeing if an upstream host kernel has the
> same problem, e.g. if upstream works, it might be easier/faster to bisect to
> a fix, than to bisect to a bug.

I did some tests over the weekend, and after installing the latest-ish mainline
kernel on the host (6.9.0-0.rc1.316.vanilla.fc40.x86_64, ignore the fc40 part,
I was just lazy and used [0] for a quick test) the soft lockups disappeared
completely. I really should've tried this before filing an issue - I had only
tried 6.7.1-0.hs1.hsx.el9.x86_64 (from [1]), which didn't help, so I mistakenly
assumed that the host kernel was not at fault.

Also, with the mainline kernel on the host, I can now run the "stock" Arch
Linux kernel in the guest as well without any soft lockups.

Given that the mainline kernel works as expected, I'll go ahead and move this
issue to the RHEL downstream (and bisect the kernel to find out what the fix
is). Thanks a lot for nudging me in the right direction!

[0] https://fedoraproject.org/wiki/Kernel_Vanilla_Repositories
[1] https://sig.centos.org/hyperscale/

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.



