On Tue, Aug 15, 2023, isaku.yamahata@xxxxxxxxx wrote:
> From: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
>
> Add slots_lock around kvm_flush_shadow_all().  kvm_gmem_release() via
> fput() and kvm_mmu_notifier_release() via mmput() can be called
> simultaneously on process exit because vhost, /dev/vhost_{net, vsock}, can
> delay the call to release mmu_notifier, kvm_mmu_notifier_release() by its
> kernel thread.  Vhost uses get_task_mm() and mmput() for the kernel thread
> to access process memory.  mmput() can defer after closing the file.
>
> kvm_flush_shadow_all() and kvm_gmem_release() can be called simultaneously.

KVM shouldn't reclaim memory on file release, it should instead do that
when the inode is "evicted":

https://lore.kernel.org/all/ZLGiEfJZTyl7M8mS@xxxxxxxxxx

> With TDX KVM, HKID releasing by kvm_flush_shadow_all() and private memory
> releasing by kvm_gmem_release() can race.  Add slots_lock to
> kvm_mmu_notifier_release().

No, the right answer is to not release the HKID until the VM is destroyed.
gmem holds a reference to its associated kvm instance, and so that will
naturally ensure all memory encrypted with the HKID is freed before the
HKID is released.  kvm_flush_shadow_all() should only tear down page
tables, it shouldn't be freeing guest_memfd memory.

Then patches 6-8 go away.