On Fri, Dec 1, 2023 at 1:30 PM Kalra, Ashish <ashish.kalra@xxxxxxx> wrote:
>
> On 12/1/2023 1:02 PM, Mingwei Zhang wrote:
> > On Fri, Dec 1, 2023 at 10:05 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> >>
> >> On Fri, Nov 10, 2023, Jacky Li wrote:
> >>> The cache flush operation in SEV guest memory reclaim events was
> >>> originally introduced to prevent security issues due to cache
> >>> incoherence and an untrusted VMM. However, when this operation gets
> >>> triggered, it causes performance degradation to the whole machine.
> >>>
> >>> This cache flush operation is performed in mmu_notifiers, in particular,
> >>> in the mmu_notifier_invalidate_range_start() function, unconditionally
> >>> on all guest memory regions. Although the intention was to flush
> >>> cache lines only when guest memory was deallocated, the excessive
> >>> invocations include many other cases where this flush is unnecessary.
> >>>
> >>> This RFC proposes using the mmu notifier event to determine whether a
> >>> cache flush is needed. Specifically, only do the cache flush when the
> >>> address range is unmapped, cleared, released or migrated. A bitmap
> >>> module param is also introduced to provide flexibility when a flush is
> >>> needed in more events, or no flush is needed, depending on the hardware
> >>> platform.
> >>
> >> I'm still not at all convinced that this is worth doing. We have a clear line of
> >> sight to cleanly and optimally handling SNP and beyond. If there is an actual
> >> use case that wants to run SEV and/or SEV-ES VMs, which can't support page
> >> migration, on the same host as traditional VMs, _and_ for some reason their
> >> userspace is incapable of providing reasonable NUMA locality, then the owners of
> >> that use case can speak up and provide justification for taking on this extra
> >> complexity in KVM.
> >
> > Hi Sean,
> >
> > Jacky and I were looking at some cases like mmu_notifier calls
> > triggered by the overloaded reason "MMU_NOTIFY_CLEAR". Even if we turn
> > off page migration etc., splitting a PMD may still happen at some point
> > under this reason, and we will never be able to turn it off by
> > tweaking kernel CONFIG options. So, I think this is the line of sight
> > for this series.
> >
> > Handling SNP could be separate, since in SNP we have per-page
> > properties, which allow KVM to know which page to flush individually.
> >
>
> For SNP + gmem, where the HVA ranges covered by the MMU notifiers are
> not acting on encrypted pages, we are ignoring MMU invalidation
> notifiers for SNP guests as part of the SNP host patches being posted
> upstream, and instead relying on gmem's own invalidation handling to clean
> them up on a per-folio basis.
>
> Thanks,
> Ashish

Oh, I have no question about that. This series only applies to SEV/SEV-ES
types of VMs.

For SNP + guest_memfd, I haven't seen the implementation details, but I
doubt you can ignore mmu_notifiers if a request does cover some encrypted
memory in error cases or corner cases. Does SNP enforce the use of
guest_memfd? How do we prevent exceptional cases? I am sure you guys have
already figured out the answers, so I don't plan to dig deeper until the
SNP host patches are accepted.

Clearly, for SEV/SEV-ES, there is no such guarantee as with guest_memfd.
Applying guest_memfd to SEV/SEV-ES might require SEV API changes, I
suspect, so I think that's equally non-trivial and thus may not be worth
pursuing.

Thanks.
-Mingwei
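
[Editor's note: a minimal sketch of the event-gated flush idea described in
the quoted RFC cover letter, for illustration only. The parameter and helper
names (sev_flush_on_events, sev_event_needs_flush()) are hypothetical and not
taken from the actual patches; only the mmu_notifier event values and
struct mmu_notifier_range come from the kernel.]

#include <linux/bits.h>
#include <linux/module.h>
#include <linux/mmu_notifier.h>

/*
 * Bitmap of mmu_notifier events that should trigger a cache flush for
 * SEV/SEV-ES guests.  Defaults to the events named in the cover letter:
 * unmap, clear, release and migrate.
 */
static unsigned long sev_flush_on_events __read_mostly =
	BIT(MMU_NOTIFY_UNMAP)   |
	BIT(MMU_NOTIFY_CLEAR)   |
	BIT(MMU_NOTIFY_RELEASE) |
	BIT(MMU_NOTIFY_MIGRATE);
module_param(sev_flush_on_events, ulong, 0444);

/*
 * Would be called from the invalidation path that today flushes
 * unconditionally; the flush (e.g. wbinvd_on_all_cpus()) is only issued
 * when the notifier event is set in the bitmap above.
 */
static bool sev_event_needs_flush(const struct mmu_notifier_range *range)
{
	return test_bit(range->event, &sev_flush_on_events);
}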