On Wed, Apr 17, 2024 at 11:24 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Wed, Apr 17, 2024, Paolo Bonzini wrote:
> > From: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
> >
> > Introduce a helper function to call the KVM page fault handler.  It
> > allows a new ioctl to invoke the fault handler and populate a mapping
> > without seeing RET_PF_* enums or other definitions that are internal
> > to the x86 KVM MMU.  The implementation is restricted to
> > two-dimensional paging for simplicity: shadow paging faults on GVAs
> > rather than L1 GPAs, which would make the API difficult to use.
> >
> > Signed-off-by: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
> > Message-ID: <9b866a0ae7147f96571c439e75429a03dcb659b6.1712785629.git.isaku.yamahata@xxxxxxxxx>
> > Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> > ---
> >  arch/x86/kvm/mmu.h     |  3 +++
> >  arch/x86/kvm/mmu/mmu.c | 32 ++++++++++++++++++++++++++++++++
> >  2 files changed, 35 insertions(+)
> >
> > diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
> > index e8b620a85627..51ff4f67e115 100644
> > --- a/arch/x86/kvm/mmu.h
> > +++ b/arch/x86/kvm/mmu.h
> > @@ -183,6 +183,9 @@ static inline void kvm_mmu_refresh_passthrough_bits(struct kvm_vcpu *vcpu,
> >  	__kvm_mmu_refresh_passthrough_bits(vcpu, mmu);
> >  }
> >
> > +int kvm_tdp_map_page(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code,
> > +		     u8 *level);
> > +
> >  /*
> >   * Check if a given access (described through the I/D, W/R and U/S bits of a
> >   * page fault error code pfec) causes a permission fault with the given PTE
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index 7fbcfc97edcc..fb2149d16f8d 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -4646,6 +4646,38 @@ int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
> >  	return direct_page_fault(vcpu, fault);
> >  }
> >
> > +int kvm_tdp_map_page(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code,
> > +		     u8 *level)
>
> If the return is an overloaded "long", then there's no need for @level,
> i.e. do the level=>size conversion in this helper.
>
> > +{
> > +	int r;
> > +
> > +	/* Restrict to TDP page fault. */
>
> Do we want to restrict this to the TDP MMU?  Not for any particular
> reason, mostly just to keep moving towards officially
> deprecating/removing TDP support from the shadow MMU.

Heh, yet another thing I briefly thought about while reviewing Isaku's
work.  In the end I decided that, with the implementation being just a
regular prefault, there's not much to be gained from keeping the shadow
MMU away from this.

The real ugly part is that if the memslots are zapped, the
pre-population effect basically goes away (damn
kvm_arch_flush_shadow_memslot).  This is the reason why I initially
thought of KVM_CHECK_EXTENSION for the VM file descriptor, to only
allow this for TDX VMs.

The real solution is to not "graduate" this ioctl to kvm/next too soon.
Let's keep it in kvm-coco-queue until TDX is ready and then make a
final decision.

Paolo

> > +	if (vcpu->arch.mmu->page_fault != kvm_tdp_page_fault)
> > +		return -EOPNOTSUPP;
> > +
> > +	r = __kvm_mmu_do_page_fault(vcpu, gpa, error_code, true, NULL, level);
> > +	if (r < 0)
> > +		return r;
> > +
> > +	switch (r) {
> > +	case RET_PF_RETRY:
> > +		return -EAGAIN;
> > +
> > +	case RET_PF_FIXED:
> > +	case RET_PF_SPURIOUS:
> > +		return 0;
>
> Going with the "long" idea, this becomes:
>
> 		end = (gpa & KVM_HPAGE_MASK(level)) + KVM_HPAGE_SIZE(level);
> 		return min(size, end - gpa);
>
> though I would vote for a:
>
> 		break;
>
> so that the happy path is nicely isolated at the end of the function.
>
> > +
> > +	case RET_PF_EMULATE:
> > +		return -EINVAL;
> > +
> > +	case RET_PF_CONTINUE:
> > +	case RET_PF_INVALID:
> > +	default:
> > +		WARN_ON_ONCE(r);
> > +		return -EIO;
> > +	}
> > +}
> > +
> >  static void nonpaging_init_context(struct kvm_mmu *context)
> >  {
> >  	context->page_fault = nonpaging_page_fault;
> > --
> > 2.43.0
> >
> >