Re: [PATCH] Revert "KVM: nVMX: Do not expose MPX VMX controls when guest MPX disabled"

Sean Christopherson <seanjc@xxxxxxxxxx> · Fri, 22 Jul 2022 15:27:44 +0000

On Fri, Jul 22, 2022, Paolo Bonzini wrote:
> Since commit 5f76f6f5ff96 ("KVM: nVMX: Do not expose MPX VMX controls
> when guest MPX disabled"), KVM has taken ownership of the "load
> IA32_BNDCFGS" and "clear IA32_BNDCFGS" VMX entry/exit controls,
> trying to set these bits in the IA32_VMX_TRUE_{ENTRY,EXIT}_CTLS
> MSRs if the guest's CPUID supports MPX, and clear otherwise.
> 
> The intent of the patch was to apply it to L0 in order to work around
> L1 kernels that lack the fix in commit 691bd4340bef ("kvm: vmx: allow
> host to access guest MSR_IA32_BNDCFGS", 2017-07-04): by hiding the
> control bits from L0, L1 hides BNDCFGS from KVM_GET_MSR_INDEX_LIST,
> and the L1 bug is neutralized even in the lack of commit 691bd4340bef.
> 
> This was perhaps a sensible kludge at the time, but a horrible
> idea in the long term and in fact it has not been extended to
> other CPUID bits like these:
> 
>   X86_FEATURE_LM => VM_EXIT_HOST_ADDR_SPACE_SIZE, VM_ENTRY_IA32E_MODE,
>                     VMX_MISC_SAVE_EFER_LMA
> 
>   X86_FEATURE_TSC => CPU_BASED_RDTSC_EXITING, CPU_BASED_USE_TSC_OFFSETTING,
>                      SECONDARY_EXEC_TSC_SCALING
> 
>   X86_FEATURE_INVPCID_SINGLE => SECONDARY_EXEC_ENABLE_INVPCID
> 
>   X86_FEATURE_MWAIT => CPU_BASED_MONITOR_EXITING, CPU_BASED_MWAIT_EXITING
> 
>   X86_FEATURE_INTEL_PT => SECONDARY_EXEC_PT_CONCEAL_VMX, SECONDARY_EXEC_PT_USE_GPA,
>                           VM_EXIT_CLEAR_IA32_RTIT_CTL, VM_ENTRY_LOAD_IA32_RTIT_CTL
> 
>   X86_FEATURE_XSAVES => SECONDARY_EXEC_XSAVES
> 
> These days it's sort of common knowledge that any MSR in
> KVM_GET_MSR_INDEX_LIST must allow *at least* setting it with KVM_SET_MSR
> to a default value, so it is unlikely that something like commit
> 5f76f6f5ff96 will be needed again.  So revert it, at the potential cost
> of breaking L1s with a 6 year old kernel.  

I would further qualify this with "breaking L1s with an _unpatched_ 6 year old
kernel".  That fix was tagged for stable and made it way to at least the 4.9 and
4.4 LTS releases.

> While in principle the L0 owner doesn't control what runs on L1, such an old
> hypervisor would probably have many other bugs.

And patching KVM to workaround L1 Linux bugs never ends well, e.g. see also the
hypercall patching snafu.

> Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> ---

Reviewed-by: Sean Christopherson <seanjc@xxxxxxxxxx>