On 12/1/2023 1:02 PM, Mingwei Zhang wrote:
On Fri, Dec 1, 2023 at 10:05 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
On Fri, Nov 10, 2023, Jacky Li wrote:
The cache flush operation on SEV guest memory reclaim events was
originally introduced to prevent security issues due to cache
incoherence and an untrusted VMM. However, when this operation is
triggered, it degrades performance of the whole machine.
This cache flush is performed in the mmu_notifiers, specifically in
mmu_notifier_invalidate_range_start(), unconditionally on all guest
memory regions. Although the intention was to flush cache lines only
when guest memory is deallocated, these invocations also cover many
other cases where the flush is unnecessary.
This RFC proposes using the mmu_notifier event type to determine
whether a cache flush is needed: the flush is performed only when the
address range is unmapped, cleared, released, or migrated. A bitmap
module param is also introduced so that, depending on the hardware
platform, the flush can be enabled for additional events or disabled
entirely.
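To make the proposal concrete, here is a minimal sketch of event-gated
flushing behind a bitmap module param. The param name
(sev_flush_on_mmu_events), its bit layout (one bit per enum
mmu_notifier_event value), and the helper are assumptions of this
illustration, not the actual patches:

  /*
   * Minimal sketch, not the posted series: gate the SEV cache flush on
   * the mmu_notifier event type.  The param name, default mask and
   * helper are hypothetical; the enum mmu_notifier_event values are the
   * upstream ones.
   */
  #include <linux/bits.h>
  #include <linux/mmu_notifier.h>
  #include <linux/moduleparam.h>

  /* Default: flush on unmap, clear, release and migrate, as proposed. */
  static unsigned long sev_flush_on_mmu_events =
          BIT(MMU_NOTIFY_UNMAP) | BIT(MMU_NOTIFY_CLEAR) |
          BIT(MMU_NOTIFY_RELEASE) | BIT(MMU_NOTIFY_MIGRATE);
  module_param(sev_flush_on_mmu_events, ulong, 0644);

  static bool sev_flush_needed(const struct mmu_notifier_range *range)
  {
          return sev_flush_on_mmu_events & BIT(range->event);
  }

With something along these lines, the wbinvd done on SEV memory reclaim
would only run when sev_flush_needed() is true for the invalidated
range, and the mask could be widened or emptied per platform via the
module param.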
I'm still not at all convinced that this is worth doing. We have a clear line of
sight to handling SNP and beyond cleanly and optimally. If there is an actual
use case that wants to run SEV and/or SEV-ES VMs, which can't support page
migration, on the same host as traditional VMs, _and_ for some reason their
userspace is incapable of providing reasonable NUMA locality, then the owners of
that use case can speak up and provide justification for taking on this extra
complexity in KVM.
Hi Sean,
Jacky and I were looking at cases like mmu_notifier calls triggered by
the overloaded MMU_NOTIFY_CLEAR event. Even if we turn off page
migration etc., PMD splitting may still happen at some point under this
event, and we will never be able to turn that off by tweaking kernel
CONFIG options. So I think that is the line of sight for this series.
Handling SNP could be separate, since in SNP we have per-page
properties, which let KVM know which pages to flush individually.
For SNP + gmem, where the HVA ranges covered by the MMU notifiers do
not act on encrypted pages, we are ignoring MMU invalidation notifiers
for SNP guests as part of the SNP host patches being posted upstream,
and instead relying on gmem's own invalidation callbacks to clean them
up on a per-folio basis.
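As a rough sketch of that direction (an assumed shape, not the posted
SNP host patches), the notifier-driven reclaim hook would simply bail
out for SNP guests, whose private memory is guest_memfd-backed and is
cleaned up through gmem's per-folio invalidation rather than through
HVA-based MMU notifiers:

  /*
   * Rough sketch only: skip the notifier-driven flush for SNP guests.
   * sev_guest() and wbinvd_on_all_cpus() are existing helpers;
   * sev_snp_guest() comes from the SNP host series; the exact condition
   * and placement in the real patches may differ.
   */
  void sev_guest_memory_reclaimed(struct kvm *kvm)
  {
          /*
           * SNP private pages are not mapped through the HVA ranges seen
           * by the MMU notifiers; gmem invalidation handles them per
           * folio.
           */
          if (!sev_guest(kvm) || sev_snp_guest(kvm))
                  return;

          wbinvd_on_all_cpus();
  }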
Thanks,
Ashish