On Fri, Jul 10, 2020 at 8:48 AM Mohammed Gamal <mgamal@xxxxxxxxxx> wrote:
>
> Check guest physical address against its maximum physical memory. If
> the guest's physical address exceeds the maximum (i.e. has reserved bits
> set), inject a guest page fault with PFERR_RSVD_MASK set.
>
> This has to be done both in the EPT violation and page fault paths, as
> there are complications in both cases with respect to the computation
> of the correct error code.
>
> For EPT violations, unfortunately the only possibility is to emulate,
> because the access type in the exit qualification might refer to an
> access to a paging structure, rather than to the access performed by
> the program.
>
> Trapping page faults instead is needed in order to correct the error code,
> but the access type can be obtained from the original error code and
> passed to gva_to_gpa. The corrections required in the error code are
> subtle. For example, imagine that a PTE for a supervisor page has a reserved
> bit set. On a supervisor-mode access, the EPT violation path would trigger.
> However, on a user-mode access, the processor will not notice the reserved
> bit and not include PFERR_RSVD_MASK in the error code.
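
For reference, the check described in the commit message can be sketched
in isolation. This is a minimal standalone model, not the KVM code: the
helper names gpa_is_illegal() and fixup_error_code() are hypothetical,
and the real fixup path does more than OR in one bit (per the commit
message it re-derives the access type and passes it to gva_to_gpa). The
only architectural fact assumed is that RSVD is bit 3 of the x86 page
fault error code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 3 of the x86 page fault error code: reserved bit violation. */
#define PFERR_RSVD_MASK (1u << 3)

/*
 * Hypothetical helper: returns true if gpa has any bit set at or above
 * the guest's MAXPHYADDR, i.e. the address is not representable in the
 * guest's physical address width.
 */
static bool gpa_is_illegal(uint64_t gpa, unsigned int maxphyaddr)
{
	/* Precondition: keeps the shift below well defined for 64-bit gpa. */
	assert(maxphyaddr > 0 && maxphyaddr < 64);
	return (gpa >> maxphyaddr) != 0;
}

/*
 * Hypothetical helper: adds PFERR_RSVD_MASK to an error code when the
 * faulting GPA is illegal. Simplified on purpose; all other bits of the
 * error code are left untouched.
 */
static uint32_t fixup_error_code(uint32_t error_code, uint64_t gpa,
				 unsigned int maxphyaddr)
{
	if (gpa_is_illegal(gpa, maxphyaddr))
		error_code |= PFERR_RSVD_MASK;
	return error_code;
}
```

With a guest MAXPHYADDR of 36, an address with bit 36 set is flagged and
gains the RSVD bit, while any address below 1 << 36 leaves the error
code unchanged.
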
>
> Co-developed-by: Mohammed Gamal <mgamal@xxxxxxxxxx>
> Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> ---
>  arch/x86/kvm/vmx/vmx.c | 24 +++++++++++++++++++++---
>  arch/x86/kvm/vmx/vmx.h |  3 ++-
>  2 files changed, 23 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 770b090969fb..de3f436b2d32 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -4790,9 +4790,15 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu)
>
>         if (is_page_fault(intr_info)) {
>                 cr2 = vmx_get_exit_qual(vcpu);
> -               /* EPT won't cause page fault directly */
> -               WARN_ON_ONCE(!vcpu->arch.apf.host_apf_flags && enable_ept);
> -               return kvm_handle_page_fault(vcpu, error_code, cr2, NULL, 0);
> +               if (enable_ept && !vcpu->arch.apf.host_apf_flags) {
> +                       /*
> +                        * EPT will cause page fault only if we need to
> +                        * detect illegal GPAs.
> +                        */
> +                       kvm_fixup_and_inject_pf_error(vcpu, cr2, error_code);
> +                       return 1;
> +               } else
> +                       return kvm_handle_page_fault(vcpu, error_code, cr2, NULL, 0);
>         }
>
>         ex_no = intr_info & INTR_INFO_VECTOR_MASK;
> @@ -5308,6 +5314,18 @@ static int handle_ept_violation(struct kvm_vcpu *vcpu)
>                 PFERR_GUEST_FINAL_MASK : PFERR_GUEST_PAGE_MASK;
>
>         vcpu->arch.exit_qualification = exit_qualification;
> +
> +       /*
> +        * Check that the GPA doesn't exceed physical memory limits, as that is
> +        * a guest page fault. We have to emulate the instruction here, because
> +        * if the illegal address is that of a paging structure, then
> +        * EPT_VIOLATION_ACC_WRITE bit is set. Alternatively, if supported we
> +        * would also use advanced VM-exit information for EPT violations to
> +        * reconstruct the page fault error code.
> +        */
> +       if (unlikely(kvm_mmu_is_illegal_gpa(vcpu, gpa)))
> +               return kvm_emulate_instruction(vcpu, 0);
> +

Is kvm's in-kernel emulator up to the task? What if the instruction in
question is AVX-512, or one of the myriad instructions that the
in-kernel emulator can't handle?
Ice Lake must support the advanced VM-exit information for EPT
violations, so that would seem like a better choice.

>         return kvm_mmu_page_fault(vcpu, gpa, error_code, NULL, 0);
>  }
>
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index b0e5e210f1c1..0d06951e607c 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -11,6 +11,7 @@
>  #include "kvm_cache_regs.h"
>  #include "ops.h"
>  #include "vmcs.h"
> +#include "cpuid.h"
>
>  extern const u32 vmx_msr_index[];
>
> @@ -552,7 +553,7 @@ static inline bool vmx_has_waitpkg(struct vcpu_vmx *vmx)
>
>  static inline bool vmx_need_pf_intercept(struct kvm_vcpu *vcpu)
>  {
> -       return !enable_ept;
> +       return !enable_ept || cpuid_maxphyaddr(vcpu) < boot_cpu_data.x86_phys_bits;
>  }
>
>  void dump_vmcs(void);
> --
> 2.26.2
>
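
The intercept condition in the last hunk is easy to reason about on its
own. Below is a minimal standalone model of it; need_pf_intercept() and
its plain-integer parameters are hypothetical stand-ins for
vmx_need_pf_intercept(), enable_ept, cpuid_maxphyaddr(vcpu) and
boot_cpu_data.x86_phys_bits, so this is a sketch of the logic only, not
the kernel function.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the patched condition: page faults must be
 * intercepted when EPT is off (shadow paging), or when the guest's
 * MAXPHYADDR is narrower than the host's physical address width, since
 * only then can the guest produce GPAs that are legal for the host but
 * illegal for the guest.
 */
static bool need_pf_intercept(bool ept_enabled,
			      unsigned int guest_maxphyaddr,
			      unsigned int host_phys_bits)
{
	/* Sanity check on the inputs: address widths never exceed 64 bits. */
	assert(guest_maxphyaddr <= 64 && host_phys_bits <= 64);
	return !ept_enabled || guest_maxphyaddr < host_phys_bits;
}
```

With EPT enabled, a guest configured with 36 physical address bits on a
46-bit host needs the intercept, while a guest whose width matches the
host's does not. With EPT disabled the intercept is always needed,
regardless of the widths.
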