On Thu, Mar 03, 2022, Paolo Bonzini wrote:
> On 3/3/22 02:43, Sean Christopherson wrote:
> > > Maybe I can redirect you to a test case to highlight a possible
> > > regression in KVM, as seen by userspace;-)
> > Regressions aside, VMCS controls are not tied to CPUID, KVM should not be mucking
> > with unrelated things. The original hack was to fix a userspace bug and should
> > never have been merged.
>
> Note that it dates back to:
>
> commit 5f76f6f5ff96587af5acd5930f7d9fea81e0d1a8
> Author: Liran Alon <liran.alon@xxxxxxxxxx>
> Date:   Fri Sep 14 03:25:52 2018 +0300
>
>     KVM: nVMX: Do not expose MPX VMX controls when guest MPX disabled
>     Before this commit, KVM exposes MPX VMX controls to L1 guest only based
>     on if KVM and host processor supports MPX virtualization.
>     However, these controls should be exposed to guest only in case guest
>     vCPU supports MPX.
>
> It's not to fix a userspace bug, it's to support userspace that doesn't
> know about using KVM_SET_MSR for VMX features---which is okay since unlike
> KVM_SET_CPUID2 it's not a mandatory call.

I disagree, IMO failure to properly configure the vCPU model is a userspace bug.
Maybe it was a userspace bug induced by a haphazard and/or poorly documented KVM
ABI, but it's still a userspace bug. One could argue that KVM should disable/clear
VMX features if userspace clears a related CPUID feature, but _setting_ a VMX
feature based on CPUID is architecturally wrong. Even if we consider one or both
cases to be desirable behavior in terms of creating a consistent vCPU model, forcing
a consistent vCPU model for this one case goes against every other ioctl in KVM's
ABI.

If we consider it KVM's responsibility to propagate CPUID state to VMX MSRs, then
KVM has a bunch of "bugs".
  X86_FEATURE_LM => VM_EXIT_HOST_ADDR_SPACE_SIZE, VM_ENTRY_IA32E_MODE, VMX_MISC_SAVE_EFER_LMA

  X86_FEATURE_TSC => CPU_BASED_RDTSC_EXITING, CPU_BASED_USE_TSC_OFFSETTING,
                     SECONDARY_EXEC_TSC_SCALING

  X86_FEATURE_INVPCID_SINGLE => SECONDARY_EXEC_ENABLE_INVPCID

  X86_FEATURE_MWAIT => CPU_BASED_MONITOR_EXITING, CPU_BASED_MWAIT_EXITING

  X86_FEATURE_INTEL_PT => SECONDARY_EXEC_PT_CONCEAL_VMX, SECONDARY_EXEC_PT_USE_GPA,
                          VM_EXIT_CLEAR_IA32_RTIT_CTL, VM_ENTRY_LOAD_IA32_RTIT_CTL

  X86_FEATURE_XSAVES => SECONDARY_EXEC_XSAVES
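
To put a number on how much a "propagate CPUID to VMX MSRs" policy would have to
cover, here is a throwaway sketch that just restates the list above as data. It
is not KVM code and reflects no kernel interface: the names
CPUID_TO_VMX_CONTROLS and controls_implied_by() are made up for illustration,
and the strings are only the macro names from the list, not their bit values.

```python
# Illustrative only: the CPUID => VMX control pairs from the list above.
# Strings are macro names, not real bit values; nothing here talks to KVM.
CPUID_TO_VMX_CONTROLS = {
    "X86_FEATURE_LM": [
        "VM_EXIT_HOST_ADDR_SPACE_SIZE",
        "VM_ENTRY_IA32E_MODE",
        "VMX_MISC_SAVE_EFER_LMA",
    ],
    "X86_FEATURE_TSC": [
        "CPU_BASED_RDTSC_EXITING",
        "CPU_BASED_USE_TSC_OFFSETTING",
        "SECONDARY_EXEC_TSC_SCALING",
    ],
    "X86_FEATURE_INVPCID_SINGLE": ["SECONDARY_EXEC_ENABLE_INVPCID"],
    "X86_FEATURE_MWAIT": [
        "CPU_BASED_MONITOR_EXITING",
        "CPU_BASED_MWAIT_EXITING",
    ],
    "X86_FEATURE_INTEL_PT": [
        "SECONDARY_EXEC_PT_CONCEAL_VMX",
        "SECONDARY_EXEC_PT_USE_GPA",
        "VM_EXIT_CLEAR_IA32_RTIT_CTL",
        "VM_ENTRY_LOAD_IA32_RTIT_CTL",
    ],
    "X86_FEATURE_XSAVES": ["SECONDARY_EXEC_XSAVES"],
}


def controls_implied_by(cpuid_features):
    """Return the VMX controls a hypothetical propagation policy would touch.

    Order follows the input features; unknown features contribute nothing.
    """
    implied = []
    for feature in cpuid_features:
        for control in CPUID_TO_VMX_CONTROLS.get(feature, []):
            if control not in implied:
                implied.append(control)
    return implied
```

For example, controls_implied_by(["X86_FEATURE_MWAIT"]) returns the two
MONITOR/MWAIT exiting controls, and passing every key in the table yields 14
controls that KVM does not currently derive from CPUID.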