On Wed, May 05, 2021, Sean Christopherson wrote:
> On Wed, May 05, 2021, Kai Huang wrote:
> > Currently pf_fixed is increased even when page fault requires emulation,
> > or fault is spurious.  Fix by only increasing it when return value is
> > RET_PF_FIXED.
> >
> > Signed-off-by: Kai Huang <kai.huang@xxxxxxxxx>
> > ---
> >  arch/x86/kvm/mmu/tdp_mmu.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> > index 1cad4c9f7c34..debe8c3ec844 100644
> > --- a/arch/x86/kvm/mmu/tdp_mmu.c
> > +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> > @@ -942,7 +942,7 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu, int write,
> >  				       rcu_dereference(iter->sptep));
> >  	}
> >
> > -	if (!prefault)
> > +	if (!prefault && ret == RET_PF_FIXED)
> >  		vcpu->stat.pf_fixed++;
>
> For RET_PF_EMULATE, I could go either way.  On one hand, KVM is installing a
> translation that accelerates future emulated MMIO, so it's kinda sorta fixing
> the page fault.  On the other hand, future emulated MMIO still page faults, it
> just gets handled without going through the full page fault handler.

Hrm, the other RET_PF_EMULATE case is when KVM creates a _new_ SPTE to handle a
page fault, but installs a read-only SPTE on a write fault because the page is
marked for write access tracking, e.g. for non-leaf guest page tables.  Bumping
pf_fixed is arguably correct in that case since KVM did fault in a page and from
the guest's perspective the page fault was fixed, it's just that "fixing" the
fault involved a bit of instruction emulation.

> If we do decide to omit RET_PF_EMULATE, it should be a separate patch and should
> be done for all flavors of MMU.  For this patch, the correct code is:
>
> 	if (ret != RET_PF_SPURIOUS)
> 		vcpu->stat.pf_fixed++;
>
> which works because "ret" cannot be RET_PF_RETRY.
>
> Looking through the other code, KVM also fails to bump stat.pf_fixed in the fast
> page fault case.  So, if we decide to fix/adjust RET_PF_EMULATE, I think it would
> make sense to handle stat.pf_fixed in a common location.

Blech.  My original thought was to move the stat.pf_fixed logic all the way out
to kvm_mmu_do_page_fault(), but that would do the wrong thing if pf_fixed is
bumped on RET_PF_EMULATE and page_fault_handle_page_track() returns
RET_PF_EMULATE.  That fast path handles the case where the guest gets a
!WRITABLE page fault on a PRESENT SPTE that KVM is write tracking.  *sigh*.

I'm leaning towards making RET_PF_EMULATE a modifier instead of a standalone
type, which would allow more precise pf_fixed adjustments and would also let us
clean up the EMULTYPE_ALLOW_RETRY_PF logic, which has a rather gross check for
detecting MMIO page faults.

> The legacy MMU also prefetches on RET_PF_EMULATE, which isn't technically wrong,
> but it's pretty much guaranteed to be a waste of time since prefetching only
> installs SPTEs if there is a backing memslot and PFN.
>
> Kai, if it's ok with you, I'll fold the above "ret != RET_PF_SPURIOUS" change
> into a separate mini-series to address the other issues I pointed out.  I was
> hoping I could paste patches for them inline and let you carry them, but moving
> stat.pf_fixed handling to a common location requires additional code shuffling
> because of async page faults :-/

Cancel that idea, given the twisty mess of RET_PF_EMULATE it's probably best for
you to just send a new version of your patch to make the TDP MMU pf_fixed
behavior match the legacy MMU.  It doesn't make sense to hold up a trivial fix
just so I can scratch a refactoring itch :-)
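
To make the difference between the two accounting rules in this thread concrete,
here is a minimal user-space sketch.  It is not kernel code: the enum only
borrows the RET_PF_* names from the discussion (the numeric values are
illustrative), and struct mock_vcpu_stat, counts_fixed_only(),
counts_non_spurious() and mock_account_fault() are hypothetical helpers invented
for this example.  The prefault condition is left out, as in the
"ret != RET_PF_SPURIOUS" snippet quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Mock return codes; names taken from the thread, values illustrative only. */
enum ret_pf {
	RET_PF_RETRY,
	RET_PF_EMULATE,
	RET_PF_FIXED,
	RET_PF_SPURIOUS,
};

/* Hypothetical stand-in for the pf_fixed counter in vcpu->stat. */
struct mock_vcpu_stat {
	unsigned long pf_fixed;
};

/* Rule from the posted patch: only RET_PF_FIXED is counted. */
bool counts_fixed_only(enum ret_pf ret)
{
	return ret == RET_PF_FIXED;
}

/*
 * Rule suggested in the reply: count everything except spurious faults.
 * This relies on the caller never passing RET_PF_RETRY, which the assert
 * makes explicit in this sketch.
 */
bool counts_non_spurious(enum ret_pf ret)
{
	assert(ret != RET_PF_RETRY);
	return ret != RET_PF_SPURIOUS;
}

/* Apply the suggested rule to the mock counter. */
void mock_account_fault(struct mock_vcpu_stat *stat, enum ret_pf ret)
{
	if (counts_non_spurious(ret))
		stat->pf_fixed++;
}
```

Under these assumptions the two rules differ only for RET_PF_EMULATE: the posted
patch would skip it, while the suggested check still counts it, which is what
keeps the TDP MMU in line with the legacy MMU behavior described above.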