On 2021/8/13 00:03, Sean Christopherson wrote:
> And for clarity and safety, I think it would be worth adding the patch
> below as a prep patch to document and enforce that walking the non-leaf
> SPTEs when faulting in a page should never terminate early.
I've sent a V2 patch updated as you suggested. The patch you provided
below doesn't need any update. It and my V2 patch do not depend on each
other, so I didn't resend your patch together with mine.

For your patch:

Acked-by: Lai Jiangshan <jiangshanlai@xxxxxxxxx>

And it can be queued earlier.
From 1c202a7e82b1931e4eb37b23aa9d7108340a6cd2 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc@xxxxxxxxxx>
Date: Thu, 12 Aug 2021 08:38:35 -0700
Subject: [PATCH] KVM: x86/mmu: Verify shadow walk doesn't terminate early in
 page faults

WARN and bail if the shadow walk for faulting in a SPTE terminates early,
i.e. doesn't reach the expected level because the walk encountered a
terminal SPTE.  The shadow walks for page faults are subtle in that they
install non-leaf SPTEs (zapping leaf SPTEs if necessary!) in the loop
body, and consume the newly created non-leaf SPTE in the loop control,
e.g. __shadow_walk_next().  In other words, the walks guarantee that the
walk will stop if and only if the target level is reached, by installing
non-leaf SPTEs to guarantee the walk remains valid.

Opportunistically use fault->goal_level instead of it.level in
FNAME(fetch) to further clarify that KVM always installs the leaf SPTE at
the target level.

Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
---
 arch/x86/kvm/mmu/mmu.c         | 3 +++
 arch/x86/kvm/mmu/paging_tmpl.h | 7 +++++--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index a272ccbddfa1..2a243ae1d64c 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -2992,6 +2992,9 @@ static int __direct_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 			account_huge_nx_page(vcpu->kvm, sp);
 	}
 
+	if (WARN_ON_ONCE(it.level != fault->goal_level))
+		return -EFAULT;
+
 	ret = mmu_set_spte(vcpu, it.sptep, ACC_ALL,
 			   fault->write, fault->goal_level, base_gfn, fault->pfn,
 			   fault->prefault, fault->map_writable);
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index f70afecbf3a2..3a8a7b2f9979 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -749,9 +749,12 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
 		}
 	}
 
+	if (WARN_ON_ONCE(it.level != fault->goal_level))
+		return -EFAULT;
+
 	ret = mmu_set_spte(vcpu, it.sptep, gw->pte_access, fault->write,
-			   it.level, base_gfn, fault->pfn, fault->prefault,
-			   fault->map_writable);
+			   fault->goal_level, base_gfn, fault->pfn,
+			   fault->prefault, fault->map_writable);
 	if (ret == RET_PF_SPURIOUS)
 		return ret;
 
-- 
2.33.0.rc1.237.g0d66db33f3-goog
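
For readers following along, the walk the commit message describes has
roughly the shape below. This is a paraphrased, simplified sketch of the
for_each_shadow_entry() loop in __direct_map() with the iterator macro
expanded by hand, not the verbatim kernel source; the key point is that
the loop control consumes the SPTE the body just installed.

	/*
	 * Paraphrased sketch of the shadow walk in __direct_map();
	 * simplified, not the verbatim kernel source.
	 */
	for (shadow_walk_init(&it, vcpu, fault->addr);
	     shadow_walk_okay(&it);
	     __shadow_walk_next(&it, *it.sptep)) {	/* re-reads *it.sptep */
		if (it.level == fault->goal_level)
			break;			/* the only intended exit */

		/* Zap a leaf (huge) SPTE if one is in the way. */
		drop_large_spte(vcpu, it.sptep);
		if (is_shadow_present_pte(*it.sptep))
			continue;

		/*
		 * Link a new non-leaf SPTE so __shadow_walk_next() can keep
		 * descending; this is what guarantees the walk stops only
		 * at fault->goal_level.
		 */
		sp = kvm_mmu_get_page(vcpu, base_gfn, it.addr, it.level - 1,
				      true, ACC_ALL);
		link_shadow_page(vcpu, it.sptep, sp);
	}

If *it.sptep were ever left as a terminal SPTE, __shadow_walk_next()
would zero it.level and shadow_walk_okay() would end the walk before the
goal level was reached; the loop body prevents that, and the new
WARN_ON_ONCE() turns any violation of the invariant into a loud, safe
failure instead of an out-of-bounds SPTE install.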