When running cloud-hypervisor tests, VM entry into an L2 guest on KVM on
Hyper-V fails with this splat (stripped for brevity):

[ 1481.600386] WARNING: CPU: 4 PID: 7641 at arch/x86/kvm/vmx/nested.c:4563 nested_vmx_vmexit+0x70d/0x790 [kvm_intel]
[ 1481.600427] CPU: 4 PID: 7641 Comm: vcpu2 Not tainted 5.15.0-1008-azure #9-Ubuntu
[ 1481.600429] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 07/22/2021
[ 1481.600430] RIP: 0010:nested_vmx_vmexit+0x70d/0x790 [kvm_intel]
[ 1481.600447] Call Trace:
[ 1481.600449]  <TASK>
[ 1481.600451]  nested_vmx_reflect_vmexit+0x10b/0x440 [kvm_intel]
[ 1481.600457]  __vmx_handle_exit+0xef/0x670 [kvm_intel]
[ 1481.600467]  vmx_handle_exit+0x12/0x50 [kvm_intel]
[ 1481.600472]  vcpu_enter_guest+0x83a/0xfd0 [kvm]
[ 1481.600524]  vcpu_run+0x5e/0x240 [kvm]
[ 1481.600560]  kvm_arch_vcpu_ioctl_run+0xd7/0x550 [kvm]
[ 1481.600597]  kvm_vcpu_ioctl+0x29a/0x6d0 [kvm]
[ 1481.600634]  __x64_sys_ioctl+0x91/0xc0
[ 1481.600637]  do_syscall_64+0x5c/0xc0
[ 1481.600667]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 1481.600670] RIP: 0033:0x7f688becdaff
[ 1481.600686]  </TASK>

The TSC multiplier field is currently not supported in eVMCS in KVM. It
was previously not supported by Hyper-V either, but support has since
been added. Because KVM does not support it, the "use TSC scaling"
control is filtered out of vmcs_config by evmcs_sanitize_exec_ctrls().

However, nested_vmx_setup_ctls_msrs() still exposes TSC scaling to L1,
because the fields that eVMCS does not support are not sanitized there.
When L1 tries to launch an L2 guest, vmcs12 has TSC scaling enabled, and
this propagates to vmcs02. But KVM doesn't set the TSC multiplier value
because kvm_has_tsc_control is false. As a result, VM entry for the L2
guest fails. (VM entry fails if "use TSC scaling" is 1 but the TSC
multiplier is 0.)

To fix this, in nested_vmx_setup_ctls_msrs(), sanitize the values read
from the MSRs by filtering out fields that are not supported by eVMCS.

This is a stable-friendly intermediate fix.
A more comprehensive fix is in progress [1] but is probably too complicated
to safely apply to stable.

[1]: https://lore.kernel.org/kvm/20220627160440.31857-1-vkuznets@xxxxxxxxxx/

Fixes: d041b5ea93352 ("KVM: nVMX: Enable nested TSC scaling")
Signed-off-by: Anirudh Rayabharam <anrayabh@xxxxxxxxxxxxxxxxxxx>
---

Changes since v1:
- Sanitize all eVMCS unsupported fields instead of just TSC scaling.

v1: https://lore.kernel.org/lkml/20220613161611.3567556-1-anrayabh@xxxxxxxxxxxxxxxxxxx/

---
 arch/x86/kvm/vmx/nested.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index f5cb18e00e78..f88d748c7cc6 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -6564,6 +6564,10 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
 		msrs->pinbased_ctls_high);
 	msrs->pinbased_ctls_low |=
 		PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR;
+#if IS_ENABLED(CONFIG_HYPERV)
+	if (static_branch_unlikely(&enable_evmcs))
+		msrs->pinbased_ctls_high &= ~EVMCS1_UNSUPPORTED_PINCTRL;
+#endif
 	msrs->pinbased_ctls_high &=
 		PIN_BASED_EXT_INTR_MASK |
 		PIN_BASED_NMI_EXITING |
@@ -6580,6 +6584,10 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
 	msrs->exit_ctls_low =
 		VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR;
 
+#if IS_ENABLED(CONFIG_HYPERV)
+	if (static_branch_unlikely(&enable_evmcs))
+		msrs->exit_ctls_high &= ~EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
+#endif
 	msrs->exit_ctls_high &=
 #ifdef CONFIG_X86_64
 		VM_EXIT_HOST_ADDR_SPACE_SIZE |
@@ -6600,6 +6608,10 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
 		msrs->entry_ctls_high);
 	msrs->entry_ctls_low =
 		VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR;
+#if IS_ENABLED(CONFIG_HYPERV)
+	if (static_branch_unlikely(&enable_evmcs))
+		msrs->entry_ctls_high &= ~EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
+#endif
 	msrs->entry_ctls_high &=
 #ifdef CONFIG_X86_64
 		VM_ENTRY_IA32E_MODE |
@@ -6657,6 +6669,10 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
 		      msrs->secondary_ctls_high);
 
 	msrs->secondary_ctls_low = 0;
+#if IS_ENABLED(CONFIG_HYPERV)
+	if (static_branch_unlikely(&enable_evmcs))
+		msrs->secondary_ctls_high &= ~EVMCS1_UNSUPPORTED_2NDEXEC;
+#endif
 	msrs->secondary_ctls_high &=
 		SECONDARY_EXEC_DESC |
 		SECONDARY_EXEC_ENABLE_RDTSCP |
-- 
2.34.1
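
For readers who want to see the masking idea outside the kernel tree, here is a minimal user-space sketch. It is not the kernel code: every DEMO_* name and both helper functions are hypothetical, the bit positions are illustrative, and the demo "unsupported" mask contains only the TSC scaling bit, whereas the real EVMCS1_UNSUPPORTED_2NDEXEC covers more controls. It only shows how clearing the unsupported bit before the control word is advertised avoids the "use TSC scaling" = 1 with multiplier = 0 combination described in the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins; the real definitions live in the kernel headers. */
#define DEMO_SECONDARY_EXEC_ENABLE_RDTSCP (UINT32_C(1) << 3)
#define DEMO_SECONDARY_EXEC_TSC_SCALING   (UINT32_C(1) << 25)

/* Assumption: the demo mask holds only TSC scaling; the kernel's mask is wider. */
#define DEMO_EVMCS1_UNSUPPORTED_2NDEXEC   DEMO_SECONDARY_EXEC_TSC_SCALING

/*
 * Hypothetical helper mirroring the shape of the fix: when eVMCS is in use,
 * drop the controls it cannot express before exposing them to L1.
 */
uint32_t demo_sanitize_secondary_ctls(uint32_t ctls_high, bool evmcs_enabled)
{
	if (evmcs_enabled)
		ctls_high &= ~DEMO_EVMCS1_UNSUPPORTED_2NDEXEC;
	return ctls_high;
}

/*
 * Hypothetical model of the failing condition from the commit message:
 * "use TSC scaling" is set but the TSC multiplier was never programmed.
 */
bool demo_l2_entry_would_fail(uint32_t ctls, uint64_t tsc_multiplier)
{
	return (ctls & DEMO_SECONDARY_EXEC_TSC_SCALING) && tsc_multiplier == 0;
}
```

With the mask applied, L1 never sees the TSC scaling capability, so it cannot enable it in vmcs12, and the zero multiplier becomes harmless. Without eVMCS the control word passes through unchanged.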