On Thu, Mar 3, 2022 at 8:15 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Thu, Mar 03, 2022, Paolo Bonzini wrote:
> > On 3/3/22 02:43, Sean Christopherson wrote:
> > > > Maybe I can redirect you to a test case to highlight a possible
> > > > regression in KVM, as seen by userspace;-)
> > > Regressions aside, VMCS controls are not tied to CPUID, KVM should not be mucking
> > > with unrelated things. The original hack was to fix a userspace bug and should
> > > never have been merged.
> >
> > Note that it dates back to:
> >
> > commit 5f76f6f5ff96587af5acd5930f7d9fea81e0d1a8
> > Author: Liran Alon <liran.alon@xxxxxxxxxx>
> > Date: Fri Sep 14 03:25:52 2018 +0300
> >
> >     KVM: nVMX: Do not expose MPX VMX controls when guest MPX disabled
> >     Before this commit, KVM exposes MPX VMX controls to L1 guest only based
> >     on if KVM and host processor supports MPX virtualization.
> >     However, these controls should be exposed to guest only in case guest
> >     vCPU supports MPX.
> >
> > It's not to fix a userspace bug, it's to support userspace that doesn't
> > know about using KVM_SET_MSR for VMX features---which is okay since unlike
> > KVM_SET_CPUID2 it's not a mandatory call.
>
> I disagree, IMO failure to properly configure the vCPU model is a userspace bug.
> Maybe it was a userspace bug induced by a haphazard and/or poorly documented KVM
> ABI, but it's still a userspace bug. One could argue that KVM should disable/clear
> VMX features if userspace clears a related CPUID feature, but _setting_ a VMX
> feature based on CPUID is architecturally wrong. Even if we consider one or both
> cases to be desirable behavior in terms of creating a consistent vCPU model, forcing
> a consistent vCPU model for this one case goes against every other ioctl in KVM's
> ABI.
>
> If we consider it KVM's responsibility to propagate CPUID state to VMX MSRs, then
> KVM has a bunch of "bugs".
>
>   X86_FEATURE_LM => VM_EXIT_HOST_ADDR_SPACE_SIZE, VM_ENTRY_IA32E_MODE, VMX_MISC_SAVE_EFER_LMA
>
>   X86_FEATURE_TSC => CPU_BASED_RDTSC_EXITING, CPU_BASED_USE_TSC_OFFSETTING,
>                      SECONDARY_EXEC_TSC_SCALING
>
>   X86_FEATURE_INVPCID_SINGLE => SECONDARY_EXEC_ENABLE_INVPCID
>
>   X86_FEATURE_MWAIT => CPU_BASED_MONITOR_EXITING, CPU_BASED_MWAIT_EXITING
>
>   X86_FEATURE_INTEL_PT => SECONDARY_EXEC_PT_CONCEAL_VMX, SECONDARY_EXEC_PT_USE_GPA,
>                           VM_EXIT_CLEAR_IA32_RTIT_CTL, VM_ENTRY_LOAD_IA32_RTIT_CTL
>
>   X86_FEATURE_XSAVES => SECONDARY_EXEC_XSAVES

I don't disagree with you, but this does beg the question, "What's going on with
all of the invocations of cr4_fixed1_update()?"
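
To make the two policies discussed above concrete, here is a minimal standalone
sketch. It is not KVM code: the DEMO_* bit values, struct demo_map, and both
helpers are hypothetical names invented for illustration, and the table covers
only two of the feature/control pairs mentioned in the thread. The first helper
only clears controls whose CPUID feature is absent; the second also sets
controls when the CPUID feature is present, which is the behavior called
architecturally wrong above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit assignments for illustration; not KVM's real layout. */
#define DEMO_CPUID_MPX           (1u << 0)
#define DEMO_CPUID_XSAVES        (1u << 1)

#define DEMO_CTRL_LOAD_BNDCFGS   (1u << 0)
#define DEMO_CTRL_CLEAR_BNDCFGS  (1u << 1)
#define DEMO_CTRL_XSAVES         (1u << 2)

/* One CPUID feature and the VMX controls that depend on it. */
struct demo_map {
	uint32_t cpuid_bit;
	uint32_t ctrl_bits;
};

static const struct demo_map demo_maps[] = {
	{ DEMO_CPUID_MPX,    DEMO_CTRL_LOAD_BNDCFGS | DEMO_CTRL_CLEAR_BNDCFGS },
	{ DEMO_CPUID_XSAVES, DEMO_CTRL_XSAVES },
};

#define DEMO_NR_MAPS (sizeof(demo_maps) / sizeof(demo_maps[0]))

/*
 * Policy A: clear controls whose CPUID feature is absent, never set any.
 * Whatever userspace configured is otherwise left alone.
 */
static uint32_t demo_clear_only(uint32_t cpuid, uint32_t ctrls)
{
	for (size_t i = 0; i < DEMO_NR_MAPS; i++) {
		if (!(cpuid & demo_maps[i].cpuid_bit))
			ctrls &= ~demo_maps[i].ctrl_bits;
	}
	return ctrls;
}

/*
 * Policy B: force the controls to track CPUID in both directions, setting
 * them (limited to what the host supports) when the feature is present.
 */
static uint32_t demo_force_consistent(uint32_t cpuid, uint32_t ctrls,
				      uint32_t supported)
{
	for (size_t i = 0; i < DEMO_NR_MAPS; i++) {
		if (cpuid & demo_maps[i].cpuid_bit)
			ctrls |= demo_maps[i].ctrl_bits & supported;
		else
			ctrls &= ~demo_maps[i].ctrl_bits;
	}
	return ctrls;
}
```

Under policy B, a control that userspace deliberately left clear is silently
turned back on whenever the matching CPUID bit is set; under policy A it stays
clear.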