On Wed, Oct 20, 2021, Lai Jiangshan wrote:
> On 2021/10/19 23:25, Sean Christopherson wrote:
> I just read some interception policy in vmx.c, if EPT=1 but vmx_need_pf_intercept()
> return true for some reasons/configs, #PF is intercepted. But CR3 write is not
> intercepted, which means there will be an EPT fault _after_ (IIUC) the CR3 write if
> the GPA of the new CR3 exceeds the guest maxphyaddr limit. And kvm queues a fault to
> the guest which is also _after_ the CR3 write, but the guest expects the fault before
> the write.
>
> IIUC, it can be fixed by intercepting CR3 write or reversing the CR3 write in EPT
> violation handler.

KVM implicitly does the latter by emulating the faulting instruction.

  static int handle_ept_violation(struct kvm_vcpu *vcpu)
  {
	...

	/*
	 * Check that the GPA doesn't exceed physical memory limits, as that is
	 * a guest page fault. We have to emulate the instruction here, because
	 * if the illegal address is that of a paging structure, then
	 * EPT_VIOLATION_ACC_WRITE bit is set. Alternatively, if supported we
	 * would also use advanced VM-exit information for EPT violations to
	 * reconstruct the page fault error code.
	 */
	if (unlikely(allow_smaller_maxphyaddr && kvm_vcpu_is_illegal_gpa(vcpu, gpa)))
		return kvm_emulate_instruction(vcpu, 0);

	return kvm_mmu_page_fault(vcpu, gpa, error_code, NULL, 0);
  }

and injecting a #GP when kvm_set_cr3() fails.

  static int em_cr_write(struct x86_emulate_ctxt *ctxt)
  {
	if (ctxt->ops->set_cr(ctxt, ctxt->modrm_reg, ctxt->src.val))
		return emulate_gp(ctxt, 0);

	/* Disable writeback. */
	ctxt->dst.type = OP_NONE;
	return X86EMUL_CONTINUE;
  }
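
For anyone who wants to poke at the control flow outside the kernel, here is a rough userspace sketch of the two decisions in the snippets above: an EPT violation on an illegal GPA is routed to the emulator, and an emulated CR3 write that fails queues a #GP and leaves CR3 alone. Everything prefixed toy_ is made up for illustration and is not a KVM symbol, and the legality check is a deliberate simplification (it only looks at address bits at or above the guest MAXPHYADDR), so treat it as a model of the idea rather than of what kvm_set_cr3() really validates.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for vCPU state. NOT the kernel's struct kvm_vcpu. */
struct toy_vcpu {
	uint64_t cr3;
	unsigned int maxphyaddr;	/* guest MAXPHYADDR in bits (assumed < 64) */
	bool gp_pending;		/* set when a #GP would be injected */
};

enum toy_path {
	TOY_PATH_MMU_FAULT,	/* stands in for kvm_mmu_page_fault() */
	TOY_PATH_EMULATE,	/* stands in for kvm_emulate_instruction() */
};

/* Simplified check: any bit at or above MAXPHYADDR makes the GPA illegal. */
static bool toy_is_illegal_gpa(const struct toy_vcpu *vcpu, uint64_t gpa)
{
	if (vcpu->maxphyaddr >= 64)
		return false;
	return (gpa >> vcpu->maxphyaddr) != 0;
}

/* Mirrors the branch in handle_ept_violation(): illegal GPA means emulate. */
static enum toy_path toy_ept_violation_path(const struct toy_vcpu *vcpu,
					    bool allow_smaller_maxphyaddr,
					    uint64_t gpa)
{
	if (allow_smaller_maxphyaddr && toy_is_illegal_gpa(vcpu, gpa))
		return TOY_PATH_EMULATE;
	return TOY_PATH_MMU_FAULT;
}

/* Returns nonzero on failure and does not touch CR3 in that case. */
static int toy_set_cr3(struct toy_vcpu *vcpu, uint64_t val)
{
	if (toy_is_illegal_gpa(vcpu, val))
		return 1;
	vcpu->cr3 = val;
	return 0;
}

/* Mirrors em_cr_write(): a failed set_cr turns into a pending #GP. */
static int toy_em_cr3_write(struct toy_vcpu *vcpu, uint64_t val)
{
	if (toy_set_cr3(vcpu, val)) {
		vcpu->gp_pending = true;	/* stands in for emulate_gp(ctxt, 0) */
		return -1;
	}
	return 0;
}
```

With maxphyaddr = 36 and cr3 = 0x1000, calling toy_em_cr3_write(&v, 1ULL << 40) fails, sets gp_pending, and leaves cr3 at 0x1000, whereas a write of 0x2000 goes through. That is the "fault before the write" ordering the guest expects, at least in this toy model.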