On Fri, Feb 07, 2025, Yan Zhao wrote:
> Always free obsolete roots when pre-faulting SPTEs in case it's called
> after a root is invalidated (e.g., by memslot removal) but before any
> vcpu_enter_guest() processing of KVM_REQ_MMU_FREE_OBSOLETE_ROOTS.
>
> Lack of kvm_mmu_free_obsolete_roots() in this scenario can lead to
> kvm_mmu_reload() failing to load a new root if the current root hpa is an
> obsolete root (which is not INVALID_PAGE). Consequently,
> kvm_arch_vcpu_pre_fault_memory() will retry infinitely due to the checking
> of is_page_fault_stale().
>
> It's safe to call kvm_mmu_free_obsolete_roots() even if there are no
> obsolete roots or if it's called a second time when vcpu_enter_guest()
> later processes KVM_REQ_MMU_FREE_OBSOLETE_ROOTS. This is because
> kvm_mmu_free_obsolete_roots() sets an obsolete root to INVALID_PAGE and
> will do nothing to an INVALID_PAGE.

Why is userspace changing memslots while prefaulting?

>
> Signed-off-by: Yan Zhao <yan.y.zhao@xxxxxxxxx>
> ---
>  arch/x86/kvm/mmu/mmu.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 47fd3712afe6..72f68458049a 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -4740,7 +4740,12 @@ long kvm_arch_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu,
>  	/*
>  	 * reload is efficient when called repeatedly, so we can do it on
>  	 * every iteration.
> +	 * Before reload, free obsolete roots in case the prefault is called
> +	 * after a root is invalidated (e.g., by memslot removal) but
> +	 * before any vcpu_enter_guest() processing of
> +	 * KVM_REQ_MMU_FREE_OBSOLETE_ROOTS.
>  	 */
> +	kvm_mmu_free_obsolete_roots(vcpu);
>  	r = kvm_mmu_reload(vcpu);
>  	if (r)
>  		return r;

I would prefer to check for obsolete roots in kvm_mmu_reload() itself, but keep
the main kvm_check_request() so that the common case handles the resulting TLB
flush without having to loop back around in vcpu_enter_guest().

diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 050a0e229a4d..f2b36d32ef40 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -104,6 +104,9 @@ void kvm_mmu_track_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new,
 
 static inline int kvm_mmu_reload(struct kvm_vcpu *vcpu)
 {
+	if (kvm_check_request(KVM_REQ_MMU_FREE_OBSOLETE_ROOTS, vcpu))
+		kvm_mmu_free_obsolete_roots(vcpu);
+
 	/*
 	 * Checking root.hpa is sufficient even when KVM has mirror root.
 	 * We can have either: