On Sun, Sep 03, 2023, Li zeming wrote:
> sptep is assigned first, so it does not need to initialize the assignment.
>
> Signed-off-by: Li zeming <zeming@xxxxxxxxxxxx>
> ---
>  arch/x86/kvm/mmu/mmu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index ec169f5c7dce..95f745aec4aa 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -3457,7 +3457,7 @@ static int fast_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
>  	struct kvm_mmu_page *sp;
>  	int ret = RET_PF_INVALID;
>  	u64 spte = 0ull;
> -	u64 *sptep = NULL;
> +	u64 *sptep;
>  	uint retry_count = 0;
>  
>  	if (!page_fault_can_be_fast(fault))

Hmm, this is safe, but there's some ugliness lurking.  Theoretically, it's
possible for spte to be left untouched by the walkers.  That _shouldn't_
happen, as it means there's a bug somewhere in KVM.  But if that did happen,
on the second or later iteration, it's (again, theoretically) possible to
consume a stale spte.

		if (tdp_mmu_enabled)
			sptep = kvm_tdp_mmu_fast_pf_get_last_sptep(vcpu, fault->addr, &spte);
		else
			sptep = fast_pf_get_last_sptep(vcpu, fault->addr, &spte);

		if (!is_shadow_present_pte(spte))	<=== could consume stale data
			break;

If we're going to tidy up sptep, I think we should also give spte similar
treatment and harden KVM in the process, e.g.

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 6325bb3e8c2b..ae2f87bbbf0a 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3430,8 +3430,8 @@ static int fast_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 {
 	struct kvm_mmu_page *sp;
 	int ret = RET_PF_INVALID;
-	u64 spte = 0ull;
-	u64 *sptep = NULL;
+	u64 spte;
+	u64 *sptep;
 	uint retry_count = 0;
 
 	if (!page_fault_can_be_fast(fault))
@@ -3447,6 +3447,14 @@ static int fast_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 		else
 			sptep = fast_pf_get_last_sptep(vcpu, fault->addr, &spte);
 
+		/*
+		 * It's entirely possible for the mapping to have been zapped
+		 * by a different task, but the root page should always be
+		 * available as the vCPU holds a reference to its root(s).
+		 */
+		if (WARN_ON_ONCE(!sptep))
+			spte = REMOVED_SPTE;
+
 		if (!is_shadow_present_pte(spte))
 			break;
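
For illustration only, here's a stand-alone userspace sketch of the hazard
described above.  It is not KVM code: walk_to_last_level(), SENTINEL_SPTE and
PRESENT_BIT are made-up stand-ins for the real walker, REMOVED_SPTE and the
shadow-present check.  It just models how an uninitialized spte can be
consumed stale on a later retry iteration if the walker bails without writing
it, and how forcing a sentinel on failure closes that hole.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define SENTINEL_SPTE	0ULL		/* stand-in for REMOVED_SPTE */
#define PRESENT_BIT	(1ULL << 0)	/* stand-in for the present bit */

/*
 * Hypothetical walker: on the "buggy" path it returns NULL and leaves
 * *spte untouched, mirroring the theoretical KVM bug described above.
 */
static uint64_t *walk_to_last_level(bool buggy, uint64_t *spte)
{
	static uint64_t leaf = PRESENT_BIT | 0x1000;

	if (buggy)
		return NULL;

	*spte = leaf;
	return &leaf;
}

int main(void)
{
	uint64_t spte;		/* intentionally uninitialized, as in the patch */
	uint64_t *sptep;
	int iter;

	for (iter = 0; iter < 2; iter++) {
		/* First iteration succeeds, second one "hits the bug". */
		sptep = walk_to_last_level(iter == 1, &spte);

		/*
		 * The hardening: never trust spte if the walk failed, force
		 * a "not present" sentinel instead of reusing the value left
		 * over from the previous iteration.
		 */
		if (!sptep) {
			fprintf(stderr, "walker bug: forcing sentinel\n");
			spte = SENTINEL_SPTE;
		}

		if (!(spte & PRESENT_BIT)) {
			puts("not present, bailing");
			break;
		}
		puts("handling present spte");
	}
	return 0;
}

Without the !sptep check, the second iteration would happily re-consume the
present spte written on the first pass, which is exactly the stale-data case
the WARN_ON_ONCE() + REMOVED_SPTE hunk above guards against.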