2017-06-12 23:08-0700, Wanpeng Li:
> From: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
>
> Add an async_page_fault field to vcpu->arch.exception to identify an async
> page fault, and constructs the expected vm-exit information fields. Force
> a nested VM exit from nested_vmx_check_exception() if the injected #PF
> is async page fault.
>
> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> Cc: Radim Krčmář <rkrcmar@xxxxxxxxxx>
> Signed-off-by: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
> ---
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> @@ -2422,13 +2422,28 @@ static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
>  static int nested_vmx_check_exception(struct kvm_vcpu *vcpu, unsigned nr)

This function could use the same treatment as vmx_queue_exception(), so we
are not mixing 'nr' with 'vcpu->arch.exception.*'.

>  {
>      struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
> +    u32 intr_info = 0;
> +    unsigned long exit_qualification = 0;
>
> -    if (!(vmcs12->exception_bitmap & (1u << nr)))
> +    if (!((vmcs12->exception_bitmap & (1u << nr)) ||
> +        (nr == PF_VECTOR && vcpu->arch.exception.async_page_fault)))
>          return 0;
>
> +    intr_info = nr | INTR_INFO_VALID_MASK;
> +    exit_qualification = vmcs_readl(EXIT_QUALIFICATION);

This part still reads EXIT_QUALIFICATION from the VMCS, which means it is
not general, and I think it would be nicer to just do a simple special case
on top:

    if (vcpu->arch.exception.async_page_fault) {
        vmcs_write32(VM_EXIT_INTR_ERROR_CODE, vcpu->arch.exception.error_code);
        nested_vmx_vmexit(vcpu, EXIT_REASON_EXCEPTION_NMI,
                          PF_VECTOR | INTR_TYPE_HARD_EXCEPTION |
                          INTR_INFO_DELIVER_CODE_MASK | INTR_INFO_VALID_MASK,
                          vcpu->arch.cr2);
        return 1;
    }

Using vcpu->arch.cr2 is suspicious, as VMX doesn't update CR2 on VM exits;
isn't this going to change the CR2 visible in the L2 guest after a nested VM
entry?

Btw. nested_vmx_check_exception() didn't support emulated exceptions at all
(it only passed through ones we got from hardware), or have I missed
something?

Thanks.
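
For illustration only, here is a standalone sketch of the two pieces discussed above: the decision to reflect an exception to L1 (exception bitmap, or a #PF flagged as async) and the composition of the interruption-information value passed to nested_vmx_vmexit(). The bit constants mirror the VMX field layout as defined in the kernel's asm/vmx.h and are reproduced so the snippet compiles on its own. The struct pending_exception and the helpers reflect_to_l1() and make_intr_info() are hypothetical names invented for this sketch, not kernel code, and the check is simplified (it ignores the page-fault error-code mask/match fields).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* VM-exit interruption-information bits, same values as asm/vmx.h. */
#define PF_VECTOR                   14u
#define INTR_TYPE_HARD_EXCEPTION    (3u << 8)
#define INTR_INFO_DELIVER_CODE_MASK (1u << 11)
#define INTR_INFO_VALID_MASK        (1u << 31)

/* Hypothetical stand-in for the relevant parts of vcpu->arch.exception. */
struct pending_exception {
    unsigned int nr;        /* exception vector, 0..31 */
    bool has_error_code;    /* exception pushes an error code */
    bool async_page_fault;  /* #PF is an async page fault notification */
};

/*
 * Hypothetical helper: should this exception cause a nested VM exit to L1?
 * An async #PF is always reflected; anything else follows L1's bitmap.
 */
static bool reflect_to_l1(uint32_t exception_bitmap,
                          const struct pending_exception *ex)
{
    assert(ex->nr < 32);    /* only architectural exception vectors */

    if (ex->nr == PF_VECTOR && ex->async_page_fault)
        return true;

    return (exception_bitmap & (1u << ex->nr)) != 0;
}

/* Hypothetical helper: build the interruption-information value. */
static uint32_t make_intr_info(const struct pending_exception *ex)
{
    uint32_t info = ex->nr | INTR_TYPE_HARD_EXCEPTION | INTR_INFO_VALID_MASK;

    if (ex->has_error_code)
        info |= INTR_INFO_DELIVER_CODE_MASK;

    return info;
}
```

With an empty exception bitmap, a pending async #PF is still reflected by reflect_to_l1(), and make_intr_info() yields 0x80000b0e for it (valid bit, deliver-error-code bit, hardware-exception type, vector 14). Deriving everything from the pending-exception state rather than from a separate 'nr' argument is in the spirit of the vmx_queue_exception() comment above.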