On Wed, 2021-07-21 at 16:18 +0000, Sean Christopherson wrote: > On Wed, Jul 21, 2021, Hu, Robert wrote: > > > > Queued, thanks. Without having checked the kvm-unit-tests > > > > sources > > > > very thoroughly, this might be a configuration issue in > > > > kvm-unit-tests; in theory "-cpu host" (unlike "-cpu > > > > host,migratable=no") should not enable TSC scaling. > > > > > > As noted in the code comments, KVM allows VMREAD/VMWRITE to all > > > defined > > > fields, whether or not the field should actually exist for the > > > vCPU model doesn't > > > enter into the equation. That's technically wrong as there are a > > > number of > > > fields that the SDM explicitly states exist iff a certain feature > > > is supported. > > > > It's right that a number of fields' existence depends on some > > certain feature. > > Meanwhile, "IA32_VMX_VMCS_ENUM indicates to software the highest > > index > > value used in the encoding of any field *supported* by the > > processor", rather than > > *existed*. > > Yes. > > > So my understanding is no matter what VMCS exec control field's > > value is set, > > value of IA32_VMX_VMCS_ENUM shall not be affected, as it reports > > the physical > > CPU's capability rather than runtime VMCS configuration. > > Yes. > > > Back to nested case, L1's VMCS configuration lays the "physical" > > capability > > for L2, right? > > Yes. > > > nested_vmx_msrs or at least nested_vmx_msrs.vmcs_enum shall be put > > to vcpu > > scope, rather than current kvm global. > > > > Current nested_vmx_calc_vmcs_enum_msr() is invoked at early stage, > > before > > vcpu features are settled. I think should be moved to later stage > > as well. > > Just moving the helper (or adding another call) wouldn't fix the > underlying > problem that KVM doesn't correctly model the interplay between VMX > features and > VMCS fields. It's arguably less wrong than letting userspace stuff > an incorrect > value, but it's not 100% correct and ignoring/overriding userspace is > sketchy at > best. I think IA32_VMX_VMCS_ENUM MSR shall not be set by QEMU, it is actually already indirectly set by user space CPU feature set. > As suggested below, the full fix is to fail VMREAD/VMWRITE to fields > that > shouldn't exist given the vCPU model. > > > > To fix that we'd need to add a "feature flag" to > > > vmcs_field_to_offset_table > > > that is checked against the vCPU model, though updating the MSR > > > would > > > probably fall onto userspace's shoulders? > > > > > > And FWIW, this is the QEMU code: > > > > > > #define VMCS12_MAX_FIELD_INDEX (0x17) > > > > > > static void kvm_msr_entry_add_vmx(X86CPU *cpu, FeatureWordArray > > > f) > > > { > > > ... > > > > > > /* > > > * Just to be safe, write these with constant values. The > > > CRn_FIXED1 > > > * MSRs are generated by KVM based on the vCPU's CPUID. > > > */ > > > kvm_msr_entry_add(cpu, MSR_IA32_VMX_CR0_FIXED0, > > > CR0_PE_MASK | CR0_PG_MASK | CR0_NE_MASK); > > > kvm_msr_entry_add(cpu, MSR_IA32_VMX_CR4_FIXED0, > > > CR4_VMXE_MASK); > > > kvm_msr_entry_add(cpu, MSR_IA32_VMX_VMCS_ENUM, > > > VMCS12_MAX_FIELD_INDEX << 1); > > > }