On 15/02/2017 20:30, Jim Mattson wrote:
> I like the idea of supporting just one guest page walker, but
> KVM_TRANSLATE looks incomplete. For instance, it doesn't include the
> access type, which makes me wonder how it deals with SMEP faults.
> Also, it doesn't seem to have a way to return page fault information
> to the caller, let alone EPT violation information if the VCPU is in
> VMX non-root mode.

Not surprising, I don't think anyone has actually used KVM_TRANSLATE in
years...

> (Is there currently a way for userspace to cause an
> emulated EPT violation VM-exit from L2 to L1 as the result of
> instruction emulation?)

Yes, of course. There are two sets of function pointers for the MMU
emulation while doing L2 emulation: vcpu->arch.nested_mmu is L2's page
tables, while vcpu->arch.mmu is L1's EPT. vcpu->arch.walk_mmu points to
one of these two, and it's where KVM starts walking page tables.

While doing L2 emulation, vcpu->arch.walk_mmu is set to
&vcpu->arch.nested_mmu (by nested_ept_init_mmu_context). Each memory
access goes through the MMU's translate_gpa function, which is a dummy
for arch.mmu and translate_nested_gpa for arch.nested_mmu.
translate_nested_gpa in turn is simply vcpu->arch.mmu.gva_to_gpa, and
that's how each L2 page table access causes a walk of L1's EPT page
tables.

If something goes wrong, each MMU has its own inject_page_fault
callback. For EPT page tables, vcpu->arch.mmu.inject_page_fault is where
an EPT vmexit is injected. The emulator then just exits through
X86EMUL_PROPAGATE_FAULT. It actually works, see commit ef54bcfeea6c
("KVM: x86: skip writeback on injection of nested exception",
2014-09-04) for an example of a bugfix in this area.

One thing where we're lacking a bit is that translate_nested_gpa should
have an argument for "translating translated guest address" vs.
"translating guest page structure address", in order to set EXITINFO or
exit qualification correctly. This is incorrect right now.

Paolo
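
For illustration only, the function-pointer arrangement described above can be pictured with a small user-space toy model. This is a sketch, not kernel code: only the member names mmu, nested_mmu, walk_mmu, gva_to_gpa, translate_gpa, inject_page_fault and the name translate_nested_gpa are taken from the text. Every toy_* identifier, the fault record, the address constants, and the rule that a "nested" flag selects the L1 callback are assumptions made so that the example compiles and can be traced by hand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t gpa_t;
typedef uint64_t gva_t;

/* All constants below are made up for the toy model. */
#define UNMAPPED_GPA ((gpa_t)-1)
#define EPT_LIMIT    0x10000ULL   /* toy EPT maps L2 GPAs below this   */
#define EPT_OFFSET   0x100000ULL  /* L1 GPA = L2 GPA + EPT_OFFSET      */
#define L2_PT_ROOT   0x1000ULL    /* L2 GPA of L2's page structure     */
#define L2_VA_LIMIT  0x20000ULL   /* toy L2 tables map GVAs below this */
#define L2_VA_OFFSET 0x2000ULL    /* L2 GPA = GVA + L2_VA_OFFSET       */

enum { TOY_CONTINUE = 0, TOY_PROPAGATE_FAULT = 1 };

struct toy_vcpu;
struct toy_fault { bool pending; bool nested; uint64_t addr; };

struct toy_mmu {
    gpa_t (*gva_to_gpa)(struct toy_vcpu *v, gva_t a, struct toy_fault *f);
    gpa_t (*translate_gpa)(struct toy_vcpu *v, gpa_t a, struct toy_fault *f);
    void (*inject_page_fault)(struct toy_vcpu *v, const struct toy_fault *f);
};

struct toy_vcpu {
    struct toy_mmu mmu;         /* L1's EPT */
    struct toy_mmu nested_mmu;  /* L2's page tables */
    struct toy_mmu *walk_mmu;   /* where the walk starts */
    int ept_violations;         /* faults reflected to L1 */
    int l2_page_faults;         /* faults delivered to L2 */
};

/* "Walk" of L1's EPT: translates an L2 GPA to an L1 GPA. */
static gpa_t ept_gva_to_gpa(struct toy_vcpu *v, gva_t l2_gpa, struct toy_fault *f)
{
    (void)v;
    if (l2_gpa >= EPT_LIMIT) {
        f->pending = true; f->nested = true; f->addr = l2_gpa;
        return UNMAPPED_GPA;
    }
    return l2_gpa + EPT_OFFSET;
}

/* translate_gpa for mmu: nothing further to translate. */
static gpa_t translate_gpa_dummy(struct toy_vcpu *v, gpa_t a, struct toy_fault *f)
{
    (void)v; (void)f;
    return a;
}

/* translate_gpa for nested_mmu: simply L1's gva_to_gpa. */
static gpa_t translate_nested_gpa(struct toy_vcpu *v, gpa_t a, struct toy_fault *f)
{
    return v->mmu.gva_to_gpa(v, a, f);
}

/* "Walk" of L2's page tables; each access goes through translate_gpa. */
static gpa_t l2_gva_to_gpa(struct toy_vcpu *v, gva_t gva, struct toy_fault *f)
{
    struct toy_mmu *mmu = v->walk_mmu;

    (void)mmu->translate_gpa(v, L2_PT_ROOT, f);   /* page-structure access */
    if (f->pending)
        return UNMAPPED_GPA;
    if (gva >= L2_VA_LIMIT) {                     /* not present in L2's tables */
        f->pending = true; f->nested = false; f->addr = gva;
        return UNMAPPED_GPA;
    }
    return mmu->translate_gpa(v, gva + L2_VA_OFFSET, f);  /* final address */
}

static void inject_ept_violation(struct toy_vcpu *v, const struct toy_fault *f)
{
    (void)f;
    v->ept_violations++;
}

static void inject_l2_page_fault(struct toy_vcpu *v, const struct toy_fault *f)
{
    (void)f;
    v->l2_page_faults++;
}

/* Set up the two MMUs as they would look while L2 is being emulated. */
static void toy_vcpu_init_nested(struct toy_vcpu *v)
{
    v->mmu.gva_to_gpa = ept_gva_to_gpa;
    v->mmu.translate_gpa = translate_gpa_dummy;
    v->mmu.inject_page_fault = inject_ept_violation;
    v->nested_mmu.gva_to_gpa = l2_gva_to_gpa;
    v->nested_mmu.translate_gpa = translate_nested_gpa;
    v->nested_mmu.inject_page_fault = inject_l2_page_fault;
    v->walk_mmu = &v->nested_mmu;
    v->ept_violations = 0;
    v->l2_page_faults = 0;
}

/* One emulated memory access; on a fault, inject it and propagate. */
static int toy_emulate_access(struct toy_vcpu *v, gva_t gva, gpa_t *out)
{
    struct toy_fault f = { false, false, 0 };

    assert(v->walk_mmu != NULL);
    *out = v->walk_mmu->gva_to_gpa(v, gva, &f);
    if (!f.pending)
        return TOY_CONTINUE;
    if (f.nested)
        v->mmu.inject_page_fault(v, &f);        /* EPT vmexit to L1 */
    else
        v->walk_mmu->inject_page_fault(v, &f);  /* #PF into L2 */
    return TOY_PROPAGATE_FAULT;
}
```

In this model, after toy_vcpu_init_nested(&v), an access whose L2 GPA falls outside the toy EPT range ends up in v.mmu.inject_page_fault, while an address missing from the toy L2 tables ends up in v.nested_mmu.inject_page_fault. The model does not record whether the failing EPT walk was for the page-structure access or for the final address, which is the missing distinction mentioned in the last paragraph of the mail.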