Not restoring the eVMCS mapping directly from vmx_set_nested_state() now looks like a bad idea: the restoration path depends on whether KVM will continue executing L2 (vmx_get_nested_state_pages()) or will have to exit to L1 (nested_vmx_vmexit()). This complicates error propagation and diverges too much from the 'native' path, where 'nested.current_vmptr' is set directly from vmx_set_nested_state().

The existing solution of postponing the eVMCS mapping is also fragile. In multiple places the code checks whether 'vmx->nested.hv_evmcs' is not NULL to distinguish between eVMCS and non-eVMCS cases, and all of these checks are incomplete because of the odd intermediate 'eVMCS is in use but not yet mapped' state. In addition, if vmx_get_nested_state() is called right after vmx_set_nested_state() without executing the guest first, the resulting state is incorrect: the KVM_STATE_NESTED_EVMCS flag will be missing.

Fix all of these issues by bringing the eVMCS restoration path closer to its 'native' sibling: put the eVMCS GPA into 'struct kvm_vmx_nested_state_hdr'. To avoid breaking the ABI, do not introduce a new flag and keep the original eVMCS mapping path through KVM_REQ_GET_NESTED_STATE_PAGES in place. To distinguish between the 'new' and 'old' formats, treat eVMCS GPA == 0 as an unset GPA (thus forcing the KVM_REQ_GET_NESTED_STATE_PAGES path). While a GPA of 0 is technically possible, it is an extremely unlikely case.
Signed-off-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
---
 arch/x86/include/uapi/asm/kvm.h |  2 ++
 arch/x86/kvm/vmx/nested.c       | 27 +++++++++++++++++++++------
 2 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index 0662f644aad9..3845977b739e 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -441,6 +441,8 @@ struct kvm_vmx_nested_state_hdr {
 	__u32 flags;
 
 	__u64 preemption_timer_deadline;
+
+	__u64 evmcs_pa;
 };
 
 struct kvm_svm_nested_state_data {
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 37fdc34f7afc..4261cf4755c8 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -6019,6 +6019,7 @@ static int vmx_get_nested_state(struct kvm_vcpu *vcpu,
 		.hdr.vmx.vmxon_pa = -1ull,
 		.hdr.vmx.vmcs12_pa = -1ull,
 		.hdr.vmx.preemption_timer_deadline = 0,
+		.hdr.vmx.evmcs_pa = -1ull,
 	};
 	struct kvm_vmx_nested_state_data __user *user_vmx_nested_state =
 		&user_kvm_nested_state->data.vmx[0];
@@ -6037,8 +6038,10 @@ static int vmx_get_nested_state(struct kvm_vcpu *vcpu,
 	if (vmx_has_valid_vmcs12(vcpu)) {
 		kvm_state.size += sizeof(user_vmx_nested_state->vmcs12);
 
-		if (vmx->nested.hv_evmcs)
+		if (vmx->nested.hv_evmcs) {
 			kvm_state.flags |= KVM_STATE_NESTED_EVMCS;
+			kvm_state.hdr.vmx.evmcs_pa = vmx->nested.hv_evmcs_vmptr;
+		}
 
 		if (is_guest_mode(vcpu) &&
 		    nested_cpu_has_shadow_vmcs(vmcs12) &&
@@ -6230,13 +6233,25 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
 
 		set_current_vmptr(vmx, kvm_state->hdr.vmx.vmcs12_pa);
 	} else if (kvm_state->flags & KVM_STATE_NESTED_EVMCS) {
+		u64 evmcs_gpa = kvm_state->hdr.vmx.evmcs_pa;
+
 		/*
-		 * nested_vmx_handle_enlightened_vmptrld() cannot be called
-		 * directly from here as HV_X64_MSR_VP_ASSIST_PAGE may not be
-		 * restored yet. EVMCS will be mapped from
-		 * nested_get_vmcs12_pages().
+		 * EVMCS GPA == 0 most likely indicates that the migration data is
+		 * coming from an older KVM which doesn't support 'evmcs_pa' in
+		 * 'struct kvm_vmx_nested_state_hdr'.
 		 */
-		kvm_make_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu);
+		if (evmcs_gpa && (evmcs_gpa != -1ull) &&
+		    (__nested_vmx_handle_enlightened_vmptrld(vcpu, evmcs_gpa, false) !=
+		     EVMPTRLD_SUCCEEDED)) {
+			return -EINVAL;
+		} else if (!evmcs_gpa) {
+			/*
+			 * EVMCS GPA can't be acquired from VP assist page here because
+			 * HV_X64_MSR_VP_ASSIST_PAGE may not be restored yet.
+			 * EVMCS will be mapped from nested_get_evmcs_page().
+			 */
+			kvm_make_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu);
+		}
 	} else {
 		return -EINVAL;
 	}
-- 
2.30.2