Re: [PATCH v4 0/6] KVM: MMU: performance tweaks for heavy CR0.WP users

On Fri, Apr 14, 2023 at 09:49:28AM -0700, Sean Christopherson wrote:
> +Jeremi
> 

Adding myself :)

> On Fri, Apr 14, 2023, Mathias Krause wrote:

...

> > OTOH, the backports give nice speed-ups, ranging from ~2.2 times faster
> > for pure EPT (legacy) MMU setups up to 18(!!!) times faster for TDP MMU
> > on v5.10.
> 
> Anyone that's enabling the TDP MMU on v5.10 is on their own, we didn't enable the
> TDP MMU by default until v5.14 for very good reasons.
> 
> > I backported the whole series down to v5.10 but left out the CR0.WP
> > guest owning patch+fix for v5.4 as the code base is too different to get
> > all the nuances right, as Sean already hinted. However, even this
> > limited backport provides a big performance fix for our use case!
> 
> As a compromise of sorts, I propose that we disable the TDP MMU by default on v5.15,
> and backport these fixes to v6.1.  v5.15 and earlier won't get "ludicrous speed", but
> I think that's perfectly acceptable since KVM has had the suboptimal behavior
> literally since EPT/NPT support was first added.
> 

Disabling TDP MMU for v5.15, and backporting things to v6.1 works for me.
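
For anyone following along who wants to see which MMU their host is using,
here is a minimal sketch. It assumes the kvm module exposes a boolean
"tdp_mmu" parameter under /sys/module/kvm/parameters/ (which should print
"Y" or "N"); the path and the helper name tdp_mmu_enabled() are illustrative
assumptions, not something taken from this series, so check them against
your own kernel.

```python
from pathlib import Path
from typing import Optional

# Assumed sysfs location of the kvm module's "tdp_mmu" parameter.
TDP_MMU_PARAM = Path("/sys/module/kvm/parameters/tdp_mmu")


def tdp_mmu_enabled(param: Path = TDP_MMU_PARAM) -> Optional[bool]:
    """Return True/False for the parameter value, or None if it cannot be read."""
    try:
        value = param.read_text().strip()
    except OSError:
        # Module not loaded, parameter absent, or no permission to read it.
        return None
    if value in ("Y", "y", "1"):
        return True
    if value in ("N", "n", "0"):
        return False
    # Unexpected content: report "unknown" rather than guess.
    return None


if __name__ == "__main__":
    messages = {
        True: "TDP MMU enabled",
        False: "TDP MMU disabled",
        None: "tdp_mmu parameter not readable",
    }
    print(messages[tdp_mmu_enabled()])
```

If the parameter is there, booting with kvm.tdp_mmu=0 (or setting the
option when loading the module) should be the way to opt out by hand on a
kernel where it still defaults to on.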

> I'm comfortable backporting to v6.1 as that is recent enough, and there weren't
> substantial MMU changes between v6.1 and v6.3 in this area.  I.e. I have a decent
> level of confidence that we aren't overlooking some subtle dependency.
> 
> For v5.15, I am less confident in the safety of a backport, and more importantly,
> I think we should disable the TDP MMU by default to mitigate the underlying flaw
> that makes the 18x speedup possible.  That flaw is that KVM can end up freeing and
> rebuilding TDP MMU roots every time CR0.WP is toggled or a vCPU transitions to/from
> SMM.
> 

The interesting thing here is that these CR0.WP fixes seem to improve things
with the legacy MMU as well, and the legacy MMU is not affected/touched by [3].

So I think you can consider Mathias' ask independent of disabling TDP MMU. On the one
hand: there is no regression here. On the other: the gain is big and seems important
to him.

I haven't had time to review these patches, so I can't judge the risk-benefit,
or whether any single patch might be a silver bullet on its own.

> We mitigated the CR0.WP case between v5.15 and v6.1[1], which is why v6.1 doesn't
> exhibit the same pain as v5.10, but Jeremi discovered that the SMM case badly affects
> KVM-on-HyperV[2], e.g. when launching KVM guests using WSL.  I posted a fix[3] to
> finally resolve the underlying bug, but as Jeremi discovered[4], backporting the fix
> to v5.15 is going to be gnarly, to say the least.  It'll be far worse than backporting
> these CR0.WP patches, and maybe even infeasible without a large scale rework (no thanks).
> 
> Anyone that will realize meaningful benefits from the TDP MMU is all but guaranteed
> to be rolling their own kernels, i.e. can do the backports themselves if they want
> to use a v5.15 based kernel.  The big selling point of the TDP MMU is that it scales
> better to hundreds of vCPUs, particularly when live migrating such VMs.  I highly
> doubt that anyone running a stock kernel is running 100+ vCPU VMs, let alone trying
> to live migrate them.
> 
> [1] https://lkml.kernel.org/r/20220209170020.1775368-1-pbonzini%40redhat.com
> [2] https://lore.kernel.org/all/959c5bce-beb5-b463-7158-33fc4a4f910c@xxxxxxxxxxxxxxxxxxx
> [3] https://lore.kernel.org/all/20230413231251.1481410-1-seanjc@xxxxxxxxxx
> [4] https://lore.kernel.org/all/7332d846-fada-eb5c-6068-18ff267bd37f@xxxxxxxxxxxxxxxxxxx


