Re: [PATCH 2/2] KVM: x86: zap invalid roots in kvm_tdp_mmu_zap_all

Sean Christopherson <seanjc@xxxxxxxxxx> · Tue, 14 Dec 2021 19:45:39 +0000

On Mon, Dec 13, 2021, Sean Christopherson wrote:
> On Mon, Dec 13, 2021, Paolo Bonzini wrote:
> > kvm_tdp_mmu_zap_all is intended to visit all roots and zap their page
> > tables, which flushes the accessed and dirty bits out to the Linux
> > "struct page"s.  Missing some of the roots has catastrophic effects,
> > because kvm_tdp_mmu_zap_all is called when the MMU notifier is being
> > removed and any PTEs left behind might become dangling by the time
> > kvm_arch_destroy_vm tears down the roots for good.
> > 
> > Unfortunately that is exactly what kvm_tdp_mmu_zap_all is doing: it
> > visits all roots via for_each_tdp_mmu_root_yield_safe, which in turn
> > uses kvm_tdp_mmu_get_root to skip invalid roots.  If the current root is
> > invalid at the time of kvm_tdp_mmu_zap_all, its page tables will remain
> > in place but will later be zapped during kvm_arch_destroy_vm.
> 
> As stated in the bug report thread[*], it should be impossible for the MMU
> notifier to be unregistered while kvm_mmu_zap_all_fast() is running.
> 
> I do believe there's a race between set_nx_huge_pages() and kvm_mmu_notifier_release(),
> but that would result in the use-after-free kvm_set_pfn_dirty() tracing back to
> set_nx_huge_pages(), not kvm_destroy_vm().  And for that, I would much prefer we
> elevate mm->users while changing the NX hugepage setting.

Mwhahaha, race confirmed with a bit of hacking to force the issue.  I'll get a
patch out.