2017-06-14 2:55 GMT+08:00 Radim Krčmář <rkrcmar@xxxxxxxxxx>:
> 2017-06-12 23:08-0700, Wanpeng Li:
>> From: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
>>
>> Add an async_page_fault field to vcpu->arch.exception to identify an async
>> page fault, and constructs the expected vm-exit information fields. Force
>> a nested VM exit from nested_vmx_check_exception() if the injected #PF
>> is async page fault.
>>
>> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
>> Cc: Radim Krčmář <rkrcmar@xxxxxxxxxx>
>> Signed-off-by: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
>> ---
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> @@ -2422,13 +2422,28 @@ static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
>>  static int nested_vmx_check_exception(struct kvm_vcpu *vcpu, unsigned nr)
>
> This function could use the same treatment as vmx_queue_exception(), so
> we are not mixing 'nr' with 'vcpu->arch.exception.*'.
>
>>  {
>>          struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
>> +        u32 intr_info = 0;
>> +        unsigned long exit_qualification = 0;
>>
>> -        if (!(vmcs12->exception_bitmap & (1u << nr)))
>> +        if (!((vmcs12->exception_bitmap & (1u << nr)) ||
>> +                (nr == PF_VECTOR && vcpu->arch.exception.async_page_fault)))
>>                  return 0;
>>
>> +        intr_info = nr | INTR_INFO_VALID_MASK;
>> +        exit_qualification = vmcs_readl(EXIT_QUALIFICATION);
>
> This part still uses EXIT_QUALIFICATION(), which means it is not general
> and I think it would be nicer to just do simple special case on top:
>
>   if (vcpu->arch.exception.async_page_fault) {
>           vmcs_write32(VM_EXIT_INTR_ERROR_CODE, vcpu->arch.exception.error_code);
>           nested_vmx_vmexit(vcpu, EXIT_REASON_EXCEPTION_NMI,
>                   PF_VECTOR | INTR_TYPE_HARD_EXCEPTION |
>                   INTR_INFO_DELIVER_CODE_MASK | INTR_INFO_VALID_MASK,
>                   vcpu->arch.cr2);
>           return 1;
>   }

Good point.

>
> Using vcpu->arch.cr2 is suspicious as VMX doesn't update CR2 on VM
> exits; isn't this going to change the CR2 visible in L2 guest after a
> nested VM entry?

Sorry, I don't fully understand the question.
As you know, vcpu->arch.cr2, which holds the async PF token, is set before
the async page fault is injected, and L1 reads it from EXIT_QUALIFICATION
on the nested VM exit. Why would that change the CR2 visible to the L2
guest after a nested VM entry?

>
> Btw. nested_vmx_check_exception() didn't support emulated exceptions at
> all (it only passed through ones we got from hardware),

I think so.

Regards,
Wanpeng Li