On Mon, Apr 22, 2024, Xiaoyao Li wrote: > On 4/19/2024 4:59 PM, Paolo Bonzini wrote: > > From: Isaku Yamahata <isaku.yamahata@xxxxxxxxx> > > > > Extract out __kvm_mmu_do_page_fault() from kvm_mmu_do_page_fault(). The > > inner function is to initialize struct kvm_page_fault and to call the fault > > handler, and the outer function handles updating stats and converting > > return code. > > I don't see how the outer function converts return code. > > > KVM_PRE_FAULT_MEMORY will call the KVM page fault handler. > > I assume it means the inner function will be used by KVM_PRE_FAULT_MEMORY. > > > This patch makes the emulation_type always set irrelevant to the return > > code. kvm_mmu_page_fault() is the only caller of kvm_mmu_do_page_fault(), > > and references the value only when PF_RET_EMULATE is returned. Therefore, > > this adjustment doesn't affect functionality. > > This paragraph needs to be removed, I think. It's not true. It's oddly worded, but I do think it's correct. kvm_arch_async_page_ready() doesn't pass emulation_type, and kvm_mmu_page_fault() bails early for all other return values: if (r < 0) return r; if (r != RET_PF_EMULATE) return 1; That said, this belongs in a separate patch (if it's actually necessary). And _that_ said, rather than add an inner version, what if we instead shuffle the stats code? pf_taken, pf_spurious, and pf_emulate should _only_ ever be bumped by kvm_mmu_page_fault(), i.e. should only track page faults that actually happened in the guest. And so handling them in kvm_mmu_do_page_fault() doesn't make any sense, because there should only ever be one caller that passes prefetch=false. Compile tested only, and kvm_mmu_page_fault() is a bit ugly (but that's solvable), but I think this would provide better separation of concerns. -- From: Sean Christopherson <seanjc@xxxxxxxxxx> Date: Wed, 12 Jun 2024 12:51:38 -0700 Subject: [PATCH 1/2] KVM: x86/mmu: Bump pf_taken stat only in the "real" page fault handler Account stat.pf_taken in kvm_mmu_page_fault(), i.e. the actual page fault handler, instead of conditionally bumping it in kvm_mmu_do_page_fault(). The "real" page fault handler is the only path that should ever increment the number of taken page faults, as all other paths that "do page fault" are by definition not handling faults that occurred in the guest. Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx> --- arch/x86/kvm/mmu/mmu.c | 2 ++ arch/x86/kvm/mmu/mmu_internal.h | 8 -------- 2 files changed, 2 insertions(+), 8 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index f2c9580d9588..3461b8c4aba2 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -5928,6 +5928,8 @@ int noinline kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u64 err } if (r == RET_PF_INVALID) { + vcpu->stat.pf_taken++; + r = kvm_mmu_do_page_fault(vcpu, cr2_or_gpa, error_code, false, &emulation_type); if (KVM_BUG_ON(r == RET_PF_INVALID, vcpu->kvm)) diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h index ce2fcd19ba6b..8efd31b3856b 100644 --- a/arch/x86/kvm/mmu/mmu_internal.h +++ b/arch/x86/kvm/mmu/mmu_internal.h @@ -318,14 +318,6 @@ static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, fault.slot = kvm_vcpu_gfn_to_memslot(vcpu, fault.gfn); } - /* - * Async #PF "faults", a.k.a. prefetch faults, are not faults from the - * guest perspective and have already been counted at the time of the - * original fault. - */ - if (!prefetch) - vcpu->stat.pf_taken++; - if (IS_ENABLED(CONFIG_MITIGATION_RETPOLINE) && fault.is_tdp) r = kvm_tdp_page_fault(vcpu, &fault); else base-commit: b7bc82a015e237862837bd1300d6ba1f5cd17466 -- 2.45.2.505.gda0bf45e8d-goog