On Thu, Dec 14, 2023, bugzilla-daemon@xxxxxxxxxx wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=218259
>
> --- Comment #2 from Joern Heissler (kernelbugs2012@xxxxxxxxxxxxxxxxx) ---
> Hi,
>
> 1. KSM is already disabled. Didn't try to enable it.
> 2. NUMA autobalancing was enabled on the host (value 1), not in the guest. When
> disabled, I can't see the issue anymore.

This is likely/hopefully the same thing Yan encountered[1]. If you are able to
test patches, the proposed fix[2] applies cleanly on v6.6 (note, I need to post
a refreshed version of the series regardless); any feedback you can provide
would be much appreciated.

KVM changes aside, I highly recommend evaluating whether or not NUMA
autobalancing is a net positive for your environment. The interactions between
autobalancing and KVM are often less than stellar, and disabling autobalancing
is sometimes a completely legitimate option/solution.

[1] https://lore.kernel.org/all/ZNnPF4W26ZbAyGto@xxxxxxxxxxxxxxxxxxxxxxxxx
[2] https://lore.kernel.org/all/20230825020733.2849862-1-seanjc@xxxxxxxxxx

> 3. tdp_mmu was "Y", disabling it seems to make no difference.

Hrm, that's odd. The commit blamed by bisection was purely a TDP MMU change.
Did you relaunch VMs after disabling the module params? While the module param
is writable, it's effectively snapshotted by each VM during creation, i.e.
toggling it won't affect running VMs.

> So might be related to NUMA. On older kernels, the flag is 1 as well.
>
> There's one difference in the kernel messages that I hadn't noticed before. The
> newer one prints "pci_bus 0000:7f: Unknown NUMA node; performance will be
> reduced" (same with ff again). The older ones don't. No idea what this means,
> if it's important, and can't find info on the web regarding it.

That was a new message added by commit ad5086108b9f ("PCI: Warn if no host
bridge NUMA node info"), which was first released in v5.5. AFAICT, that warning
is only complaining about the driver code for PCI devices possibly running on
the wrong node. However, if you are seeing that error on v6.1 or v6.6, but not
v5.17, i.e. if the message started showing up well after the printk was added,
then it might be a symptom of an underlying problem, e.g. maybe the kernel is
botching parsing of ACPI tables?

> I think the kernel is preemptible:

Ya, not fully preemptible (voluntary only), but the important part is that KVM
will drop mmu_lock if there is contention (which is a "requirement" for the bug
that Yan encountered).
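
For reference, the "snapshotted at VM creation" behavior is roughly the pattern
below. This is an illustrative sketch only, not the actual KVM code; the 0644
permission and the per-VM kvm->arch.tdp_mmu_enabled copy are assumptions for
the example:

	/*
	 * Illustrative only: a writable module param that each VM snapshots
	 * when it is created.  Writing the param later changes only the
	 * global default, so already-running VMs keep whatever value they
	 * saw at creation time.
	 */
	static bool tdp_mmu_enabled = true;
	module_param_named(tdp_mmu, tdp_mmu_enabled, bool, 0644);

	void kvm_mmu_init_vm(struct kvm *kvm)
	{
		/* hypothetical per-VM copy of the module param */
		kvm->arch.tdp_mmu_enabled = tdp_mmu_enabled;
	}

That's why toggling the param only takes effect for VMs launched after the
write.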
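
And to illustrate the mmu_lock point: even with only voluntary preemption, KVM
explicitly yields mmu_lock when there's contention, roughly along these lines
(again just a sketch of the pattern, the walk_one_step() helper is made up):

	/*
	 * Sketch of the yield-on-contention pattern: while walking MMU
	 * state under mmu_lock, periodically check whether another task is
	 * waiting on the lock (or a reschedule is pending) and, if so, drop
	 * the lock, yield, and reacquire before resuming the walk.
	 */
	static void walk_with_yield(struct kvm *kvm)
	{
		write_lock(&kvm->mmu_lock);
		while (walk_one_step(kvm)) {	/* made-up helper */
			if (rwlock_needbreak(&kvm->mmu_lock) || need_resched())
				cond_resched_rwlock_write(&kvm->mmu_lock);
		}
		write_unlock(&kvm->mmu_lock);
	}

That lock dropping under contention is the window the NUMA autobalancing bug
needs to trigger.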