2017-06-16 23:38 GMT+08:00 Radim Krčmář <rkrcmar@xxxxxxxxxx>:
> 2017-06-16 22:24+0800, Wanpeng Li:
>> 2017-06-16 21:37 GMT+08:00 Radim Krčmář <rkrcmar@xxxxxxxxxx>:
>> > 2017-06-14 19:26-0700, Wanpeng Li:
>> >> From: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
>> >>
>> >> Add an async_page_fault field to vcpu->arch.exception to identify an async
>> >> page fault, and construct the expected vm-exit information fields. Force
>> >> a nested VM exit from nested_vmx_check_exception() if the injected #PF
>> >> is an async page fault.
>> >>
>> >> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
>> >> Cc: Radim Krčmář <rkrcmar@xxxxxxxxxx>
>> >> Signed-off-by: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
>> >> ---
>> >> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> >> @@ -452,7 +452,11 @@ EXPORT_SYMBOL_GPL(kvm_complete_insn_gp);
>> >>  void kvm_inject_page_fault(struct kvm_vcpu *vcpu, struct x86_exception *fault)
>> >>  {
>> >>  	++vcpu->stat.pf_guest;
>> >> -	vcpu->arch.cr2 = fault->address;
>> >> +	vcpu->arch.exception.async_page_fault = fault->async_page_fault;
>> >
>> > I think we need to act as if arch.exception.async_page_fault was not
>> > pending in kvm_vcpu_ioctl_x86_get_vcpu_events(). Otherwise, if we
>> > migrate with a pending async_page_fault exception, we'd inject it as a
>> > normal #PF, which could confuse/kill the nested guest.
>> >
>> > And kvm_vcpu_ioctl_x86_set_vcpu_events() should clear the flag for
>> > sanity as well.
>>
>> Do you mean we should add a field like async_page_fault to
>> kvm_vcpu_events::exception, then save arch.exception.async_page_fault
>> to events->exception.async_page_fault through KVM_GET_VCPU_EVENTS and
>> restore events->exception.async_page_fault to
>> arch.exception.async_page_fault through KVM_SET_VCPU_EVENTS?
>
> No, I thought we could get away with a disgusting hack of hiding the
> exception from userspace, which would work for migration, but not if
> local userspace did KVM_GET_VCPU_EVENTS and KVM_SET_VCPU_EVENTS ...
>
> Extending the userspace interface would work, but I'd do it as a last
> resort, after all conservative solutions have failed.
> async_pf migration is very crude, so exposing the exception is just an
> ugly workaround for the local case. Adding the flag would also require
> userspace configuration of async_pf features for the guest to keep
> compatibility.
>
> I see two options that might be simpler than adding the userspace flag:
>
>  1) do the nested VM exit sooner, at the place where we now queue #PF,
>  2) queue the #PF later, save the async_pf in some intermediate
>     structure and consume it at the place where you proposed the nested
>     VM exit.

How about something like the below: don't report the exception to
userspace when "is_guest_mode(vcpu) && vcpu->arch.exception.nr ==
PF_VECTOR && vcpu->arch.exception.async_page_fault" holds? Losing a
reschedule optimization is not that important in L1.
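
On the KVM_SET_VCPU_EVENTS side, kvm_vcpu_ioctl_x86_set_vcpu_events()
can then just clear the flag for sanity, as you suggested. Untested
sketch; the surrounding lines are quoted from memory:

	vcpu->arch.exception.pending = events->exception.injected;
	vcpu->arch.exception.nr = events->exception.nr;
	vcpu->arch.exception.has_error_code = events->exception.has_error_code;
	vcpu->arch.exception.error_code = events->exception.error_code;
+	vcpu->arch.exception.async_page_fault = false;	/* never restored from userspace */

And the KVM_GET_VCPU_EVENTS side: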
@@ -3072,13 +3074,16 @@ static void kvm_vcpu_ioctl_x86_get_vcpu_events(struct kvm_vcpu *vcpu,
 					       struct kvm_vcpu_events *events)
 {
 	process_nmi(vcpu);
-	events->exception.injected =
-		vcpu->arch.exception.pending &&
-		!kvm_exception_is_soft(vcpu->arch.exception.nr);
-	events->exception.nr = vcpu->arch.exception.nr;
-	events->exception.has_error_code = vcpu->arch.exception.has_error_code;
-	events->exception.pad = 0;
-	events->exception.error_code = vcpu->arch.exception.error_code;
+	if (!(is_guest_mode(vcpu) && vcpu->arch.exception.nr == PF_VECTOR &&
+	      vcpu->arch.exception.async_page_fault)) {
+		events->exception.injected =
+			vcpu->arch.exception.pending &&
+			!kvm_exception_is_soft(vcpu->arch.exception.nr);
+		events->exception.nr = vcpu->arch.exception.nr;
+		events->exception.has_error_code = vcpu->arch.exception.has_error_code;
+		events->exception.pad = 0;
+		events->exception.error_code = vcpu->arch.exception.error_code;
+	}

Regards,
Wanpeng Li