On Wed, Dec 4, 2024 at 2:07 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Wed, Dec 04, 2024, Kevin Loughlin wrote:
> > On Tue, Dec 3, 2024 at 4:27 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> > > > @@ -2152,7 +2191,7 @@ void sev_vm_destroy(struct kvm *kvm)
> > > >  	 * releasing the pages back to the system for use. CLFLUSH will
> > > >  	 * not do this, so issue a WBINVD.
> > > >  	 */
> > > > -	wbinvd_on_all_cpus();
> > > > +	sev_do_wbinvd(kvm);
> > >
> > > I am 99% certain this wbinvd_on_all_cpus() can simply be dropped. sev_vm_destroy()
> > > is called after KVM's mmu_notifier has been unregistered, which means it's called
> > > after kvm_mmu_notifier_release() => kvm_arch_guest_memory_reclaimed().
> >
> > I think we need a bit of rework before dropping it (which I propose we
> > do in a separate series), but let me know if there's a mistake in my
> > reasoning here...
> >
> > Right now, sev_guest_memory_reclaimed() issues writebacks for SEV and
> > SEV-ES guests but does *not* issue writebacks for SEV-SNP guests.
> > Thus, I believe it's possible a SEV-SNP guest reaches sev_vm_destroy()
> > with dirty encrypted lines in processor caches. Because SME_COHERENT
> > doesn't guarantee coherence across CPU-DMA interactions (d45829b351ee
> > ("KVM: SVM: Flush when freeing encrypted pages even on SME_COHERENT
> > CPUs")), it seems possible that the memory gets re-allocated for DMA,
> > written back from an (unencrypted) DMA, and then corrupted when the
> > dirty encrypted version gets written back over it, right?
> >
> > And potentially the same thing for why we can't yet drop the writeback
> > in sev_flush_encrypted_page() without a bit of rework?
>
> Argh, this last one probably does apply to SNP. KVM requires SNP VMs to be backed
> with guest_memfd, and flushing for that memory is handled by sev_gmem_invalidate().
> But the VMSA is kernel-allocated and so needs to be flushed manually. On the plus
> side, the VMSA flush shouldn't use WB{NO}INVD unless things go sideways, so trying
> to optimize that path isn't worth doing.

Ah thanks, yes, agreed on both counts: dropping the WB{NO}INVD is fine
on the sev_vm_destroy() path given sev_gmem_invalidate(), and the
sev_flush_encrypted_page() path still needs the WB{NO}INVD as a
fallback for now.

On that note, the WBINVD in sev_mem_enc_unregister_region() can be
dropped too then, right? My understanding is that the host will instead
do the WB{NO}INVD for SEV(-ES) guests in sev_guest_memory_reclaimed(),
and sev_gmem_invalidate() will handle SEV-SNP guests.

All in all, I now agree we can drop the unneeded case(s) of issuing
WB{NO}INVDs in this series in an additional commit. I'll then rebase
[0] on the latest version of this series and can also work on the
migration optimizations atop all of it, if that works for you, Sean.

[0] https://lore.kernel.org/lkml/20241203005921.1119116-1-kevinloughlin@xxxxxxxxxx/

Thanks!
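
P.S. For concreteness, below is a minimal sketch of the sort of
targeted flush I'm picturing behind sev_do_wbinvd(), i.e. only hitting
pCPUs that may hold dirty guest-tagged cache lines. This is purely
illustrative, not the actual patch: wbinvd_dirty_mask is a made-up name
here for a cpumask in struct kvm_sev_info whose bits would get set
(e.g. on the vCPU run path) whenever a vCPU runs on a pCPU.

/*
 * Illustrative sketch for arch/x86/kvm/svm/sev.c; assumes a
 * hypothetical cpumask_var_t wbinvd_dirty_mask in struct kvm_sev_info.
 */
static void sev_wbinvd_cpu(void *unused)
{
	/* Write back and invalidate this CPU's caches. */
	wbinvd();
}

static void sev_do_wbinvd(struct kvm *kvm)
{
	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;

	/*
	 * Flush only the pCPUs that may hold dirty guest-tagged cache
	 * lines, i.e. those that have run a vCPU of this VM, instead
	 * of blasting WBINVD on every CPU in the system.
	 */
	on_each_cpu_mask(sev->wbinvd_dirty_mask, sev_wbinvd_cpu, NULL, true);
}

If allocating the tracking mask ever fails, this would of course need
to fall back to wbinvd_on_all_cpus() (or WBNOINVD where supported) to
stay correct.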