On Fri, Apr 15, 2022, David Matlack wrote: > On Thu, Apr 14, 2022 at 5:51 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote: > > diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h > > index 1bff453f7cbe..c0e502b17ef7 100644 > > --- a/arch/x86/kvm/mmu/mmu_internal.h > > +++ b/arch/x86/kvm/mmu/mmu_internal.h > > @@ -143,6 +143,7 @@ unsigned int pte_list_count(struct kvm_rmap_head *rmap_head); > > /* > > * Return values of handle_mmio_page_fault, mmu.page_fault, and fast_page_fault(). > > * > > + * RET_PF_CONTINUE: So far, so good, keep handling the page fault. > > * RET_PF_RETRY: let CPU fault again on the address. > > * RET_PF_EMULATE: mmio page fault, emulate the instruction directly. > > * RET_PF_INVALID: the spte is invalid, let the real page fault path update it. > > RET_PF_CONTINUE and RET_PF_INVALID are pretty similar, they both > indicate to the PF handler that it should keep going. What do you > think about taking this a step further and removing RET_PF_INVALID and > RET_PF_CONTINUE and using 0 instead? RET_PF_INVALID is useful because it allows kvm_mmu_page_fault() to sanity check that the page fault handler didn't barf: r = kvm_mmu_do_page_fault(vcpu, cr2_or_gpa, lower_32_bits(error_code), false); if (KVM_BUG_ON(r == RET_PF_INVALID, vcpu->kvm)) return -EIO; It's a bit odd that fast_page_fault() returns RET_PF_INVALID instead of RET_PF_CONTINUE, but at the same time the fault _is_ indeed invalid for fast handling. > That way we can replace: > > if (ret != RET_PF_CONTINUE) > return r; > > and > > if (ret != RET_PF_INVALID) > return r; > > with > > if (r) > return r; > > ? We could actually do that now, at least for RET_PF_CONTINUE, but I deliberately avoided doing that so as to not take a hard dependency on the underlying value of RET_PF_CONTINUE. KVM's magic "0 == exit to userspace, 1 == resume guest" logic is dangerous, e.g. it's caused real bugs in the past. But in that case, the return value is multiplexed with -errno, i.e. there's a good reason for using magic values. For this code, there's no such requirement because the caller must convert the RET_PF_* return to the aforementioned magic returns. And the even bigger motivation was to discourage "if (r)" for this case because that suggests that this code _does_ utilize KVM's magic return value. I actually wrote it that way at first and then got completely turned around and forgot what value was being returned :-) Heh, and thinking about it, I'd be tempted to assign RET_PF_CONTINUE a non-zero value if some debug Kconfig is enabled to really enforce that KVM explicitly checks for RET_PF_CONTINUE instead of assuming a non-zero value means "bail". > > @@ -151,9 +152,15 @@ unsigned int pte_list_count(struct kvm_rmap_head *rmap_head); > > * > > * Any names added to this enum should be exported to userspace for use in > > * tracepoints via TRACE_DEFINE_ENUM() in mmutrace.h > > Please add RET_PF_CONTINUE to mmutrace.h, e.g. > > TRACE_DEFINE_ENUM(RET_PF_CONTINUE); > > It doesn't matter in practice (yet) since the trace enums are only > used for trace_fast_page_fault, which does not return RET_PF_CONTINUE. > But it would be good to keep it up to date. Drat, missed that. Thanks!