Re: [Bug 218259] High latency in KVM guests

On Tue, Dec 19, 2023, bugzilla-daemon@xxxxxxxxxx wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=218259
> 
> --- Comment #6 from Joern Heissler (kernelbugs2012@xxxxxxxxxxxxxxxxx) ---
> (In reply to Sean Christopherson from comment #5)
> 
> > This is likely/hopefully the same thing Yan encountered[1].  If you are
> > able to test patches, the proposed fix[2] applies cleanly on v6.6 (note, I
> > need to post a refreshed version of the series regardless), any feedback
> > you can provide would be much appreciated.
> > 
> > [1] https://lore.kernel.org/all/ZNnPF4W26ZbAyGto@xxxxxxxxxxxxxxxxxxxxxxxxx
> > [2] https://lore.kernel.org/all/20230825020733.2849862-1-seanjc@xxxxxxxxxx
> 
> I admit that I don't understand most of what's written in those threads.

LOL, no worries, sometimes none of us understand what's written either ;-)

> I applied the two patches from [2] (excluding [3]) to v6.6 and it appears to
> solve the problem.
> 
> However, I haven't measured how/if any of the changes/flags affect
> performance, or whether any other problems are caused. After about 1 hour of
> uptime it appears to be okay.

Don't worry too much about additional testing.  Barring a straight up bug (knock
wood), the changes in those patches have a very, very low probability of
introducing unwanted side effects.

> > KVM changes aside, I highly recommend evaluating whether or not NUMA
> > autobalancing is a net positive for your environment.  The interactions
> > between autobalancing and KVM are often less than stellar, and disabling
> > autobalancing is sometimes a completely legitimate option/solution.
> 
> I'll have to evaluate multiple options for my production environment.
> Patching and building the kernel myself would only be a last resort, and it
> will probably take a while until Debian ships a patch for the issue. So maybe
> I'll disable NUMA balancing, or perhaps try to pin a VM's memory+CPUs to a
> single NUMA node.
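
FWIW, NUMA autobalancing can be toggled at runtime through the
kernel.numa_balancing sysctl, so trying that option doesn't require a rebuild
or even a reboot.  Rough sketch of flipping it off programmatically (same
effect as sysctl -w kernel.numa_balancing=0; needs root and doesn't persist
across reboots):

  /*
   * Sketch: disable NUMA automatic balancing at runtime by writing "0" to
   * /proc/sys/kernel/numa_balancing.  Equivalent to
   * "sysctl -w kernel.numa_balancing=0".
   */
  #include <stdio.h>

  int main(void)
  {
          FILE *f = fopen("/proc/sys/kernel/numa_balancing", "w");

          if (!f) {
                  perror("numa_balancing");
                  return 1;
          }
          fputs("0\n", f);
          return fclose(f) ? 1 : 0;
  }

Pinning a VM's memory and vCPUs to a single node (e.g. libvirt's <numatune>
and <vcpupin>, or numactl around QEMU) attacks the problem from the other
side, by avoiding the cross-node accesses that autobalancing reacts to, so the
two approaches are complementary.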

Another viable option is to disable the TDP MMU, at least until the above patches
land and are picked up by Debian.  You could even reference commit 7e546bd08943
("Revert "KVM: x86: enable TDP MMU by default"") from the v5.15 stable tree if
you want a paper trail that provides some justification as to why it's ok to revert
back to the "old" MMU.

Quoting from that:

  : As far as what is lost by disabling the TDP MMU, the main selling point of
  : the TDP MMU is its ability to service page fault VM-Exits in parallel,
  : i.e. the main benefactors of the TDP MMU are deployments of large VMs
  : (hundreds of vCPUs), and in particular deployments that live-migrate such
  : VMs and thus need to fault-in huge amounts of memory on many vCPUs after
  : restarting the VM after migration.

In other words, the old MMU is not broken, e.g. it didn't suddenly become unusable
after 15+ years of use.  We enabled the newfangled TDP MMU by default because it
is the long-term replacement, e.g. it can scale to support use cases that the old
MMU falls over on, and we want to put the old MMU into maintenance-only mode.

But we are still ironing out some wrinkles in the TDP MMU, particularly for host
kernels that support preemption (the kernel has lock contention logic that is
unique to preemptible kernels).  And in the meantime, for most KVM use cases, the
old MMU is still perfectly serviceable.
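
If you do go the disable-TDP-MMU route, the switch is the kvm.tdp_mmu module
parameter.  On recent kernels it's read-only at runtime, so it has to be set
when the kvm module is loaded, e.g. kvm.tdp_mmu=0 on the kernel command line
or an options line in modprobe.d, and the current state can be read back
through sysfs.  Quick sketch for checking it (assumes the standard
/sys/module/kvm/parameters/tdp_mmu path; prints Y or N):

  /*
   * Sketch: report whether KVM's TDP MMU is currently enabled by reading the
   * kvm module parameter.  Disabling it requires reloading kvm with
   * tdp_mmu=0 (or booting with kvm.tdp_mmu=0); the parameter can't be
   * changed after the module is loaded.
   */
  #include <stdio.h>

  int main(void)
  {
          char val = '?';
          FILE *f = fopen("/sys/module/kvm/parameters/tdp_mmu", "r");

          if (!f) {
                  perror("tdp_mmu");
                  return 1;
          }
          if (fread(&val, 1, 1, f) != 1)
                  val = '?';
          fclose(f);
          printf("tdp_mmu: %c\n", val);
          return 0;
  }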



