Sean Christopherson <seanjc@xxxxxxxxxx> writes:

> On Mon, Aug 22, 2022, Vitaly Kuznetsov wrote:
>> Sean Christopherson <seanjc@xxxxxxxxxx> writes:
>>
>> > On Thu, Aug 18, 2022, Vitaly Kuznetsov wrote:
>> >> Sean Christopherson <seanjc@xxxxxxxxxx> writes:
>> >>
>> >> > On Tue, Aug 02, 2022, Vitaly Kuznetsov wrote:
>> >> >> + * Note: HV_X64_NESTED_EVMCS1_2022_UPDATE is not currently documented in any
>> >> >> + * published TLFS version. When the bit is set, nested hypervisor can use
>> >> >> + * 'updated' eVMCSv1 specification (perf_global_ctrl, s_cet, ssp, lbr_ctl,
>> >> >> + * encls_exiting_bitmap, tsc_multiplier fields which were missing in 2016
>> >> >> + * specification).
>> >> >> + */
>> >> >> +#define HV_X64_NESTED_EVMCS1_2022_UPDATE		BIT(0)
>> >> >
>> >> > This bit is now defined[*], but the docs says it's only for perf_global_ctrl.  Are
>> >> > we expecting an update to the TLFS?
>> >> >
>> >> > 	Indicates support for the GuestPerfGlobalCtrl and HostPerfGlobalCtrl fields
>> >> > 	in the enlightened VMCS.
>> >> >
>> >> > [*] https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/feature-discovery#hypervisor-nested-virtualization-features---0x4000000a
>> >> >
>> >>
>> >> Oh well, better this than nothing. I'll ping the people who told me
>> >> about this bit that their description is incomplete.
>> >
>> > Not that it changes anything, but I'd rather have no documentation.  I'd much rather
>> > KVM say "this is the undocumented behavior" than "the document behavior is wrong".
>> >
>>
>> So I reached out to Microsoft and their answer was that for all these new
>> eVMCS fields (including *PerfGlobalCtrl) observing architectural VMX
>> MSRs should be enough. *PerfGlobalCtrl case is special because of Win11
>> bug (if we expose the feature in VMX feature MSRs but don't set
>> CPUID.0x4000000A.EBX BIT(0) it just doesn't boot).
>
> I.e. TSC_SCALING shouldn't be gated on the flag?
> If so, then the 2-D array approach is overkill since (a) the CPUID flag only
> controls PERF_GLOBAL_CTRL and (b) we aren't expecting any more flags in the
> future.
>

Unfortunately, we have to gate the presence of these new features on
something, otherwise the VMM has no way to specify which particular eVMCS
"revision" it wants (TL;DR: we will break migration). My initial
implementation was inventing an 'eVMCS revision' concept:
https://lore.kernel.org/kvm/20220629150625.238286-7-vkuznets@xxxxxxxxxx/

which is needed if we don't gate all these new fields on
CPUID.0x4000000A.EBX BIT(0).

Going forward, we will still (likely) need something when new fields show up.

> What about this for an implementation?
>
> static bool evmcs_has_perf_global_ctrl(struct kvm_vcpu *vcpu)
> {
> 	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
>
> 	/*
> 	 * Filtering VMX controls for eVMCS compatibility should only be done
> 	 * for guest accesses, and all such accesses should be gated on Hyper-V
> 	 * being enabled and initialized.
> 	 */
> 	if (WARN_ON_ONCE(!hv_vcpu))
> 		return false;
>
> 	return hv_vcpu->cpuid_cache.nested_ebx & HV_X64_NESTED_EVMCS1_PERF_GLOBAL_CTRL;
> }
>
> static u32 evmcs_get_unsupported_ctls(struct kvm_vcpu *vcpu, u32 msr_index)
> {
> 	u32 unsupported_ctrls;
>
> 	switch (msr_index) {
> 	case MSR_IA32_VMX_EXIT_CTLS:
> 	case MSR_IA32_VMX_TRUE_EXIT_CTLS:
> 		unsupported_ctrls = EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
> 		if (!evmcs_has_perf_global_ctrl(vcpu))
> 			unsupported_ctrls |= VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
> 		return unsupported_ctrls;
> 	case MSR_IA32_VMX_ENTRY_CTLS:
> 	case MSR_IA32_VMX_TRUE_ENTRY_CTLS:
> 		unsupported_ctrls = EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
> 		if (!evmcs_has_perf_global_ctrl(vcpu))
> 			unsupported_ctrls |= VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
> 		return unsupported_ctrls;
> 	case MSR_IA32_VMX_PROCBASED_CTLS2:
> 		return EVMCS1_UNSUPPORTED_2NDEXEC;
> 	case MSR_IA32_VMX_TRUE_PINBASED_CTLS:
> 	case MSR_IA32_VMX_PINBASED_CTLS:
> 		return EVMCS1_UNSUPPORTED_PINCTRL;
> 	case MSR_IA32_VMX_VMFUNC:
> 		return EVMCS1_UNSUPPORTED_VMFUNC;
> 	default:
> 		KVM_BUG_ON(1, vcpu->kvm);
> 		return 0;
> 	}
> }
>
> void nested_evmcs_filter_control_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata)
> {
> 	u64 unsupported_ctrls = evmcs_get_unsupported_ctls(vcpu, msr_index);
>
> 	if (msr_index == MSR_IA32_VMX_VMFUNC)
> 		*pdata &= ~unsupported_ctrls;
> 	else
> 		*pdata &= ~(unsupported_ctrls << 32);
> }
>

It's smaller and I like it but it would only work in conjunction with
KVM_CAP_HYPERV_ENLIGHTENED_VMCS2...

>
>> What I'm still concerned about is future proofing KVM for new
>> features. When something is getting added to KVM for which no eVMCS
>> field is currently defined, both Hyper-V-on-KVM and KVM-on-Hyper-V cases
>> should be taken care of. It would probably be better to reverse our
>> filtering, explicitly listing features supported in eVMCS. The lists are
>> going to be fairly long but at least we won't have to take care of any
>> new architectural feature added to KVM.
>
> Having the filtering be opt-in crossed my mind as well.  Reversing the filtering
> can be done after this series though, correct?
>

Yes, that's my plan: get this in to fix the immediate issue with 2022
features and probably reverse the filtering before Microsoft releases
something else :-)

-- 
Vitaly
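
For reference, the difference between the current opt-out filtering and the "reversed" (opt-in) filtering discussed above can be sketched in standalone C. This is only an illustration, not KVM code: the `CTRL_*` bits, the `DEMO_*` lists and the `filter_opt_*()` helpers are hypothetical names invented for this sketch. The one architectural fact it relies on is that VMX control capability MSRs report the allowed-1 settings in bits 63:32, which is why the quoted code shifts the mask left by 32.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical control bits, for illustration only (not real VMX encodings). */
#define CTRL_A   (1u << 0)  /* control that has an eVMCS field */
#define CTRL_B   (1u << 1)  /* control known to lack an eVMCS field */
#define CTRL_NEW (1u << 2)  /* control added after both lists were written */

#define DEMO_UNSUPPORTED (CTRL_B)  /* opt-out list: what to hide */
#define DEMO_SUPPORTED   (CTRL_A)  /* opt-in list: what to expose */

/* Opt-out: clear only the listed allowed-1 bits (bits 63:32). */
static uint64_t filter_opt_out(uint64_t msr, uint32_t unsupported)
{
	return msr & ~((uint64_t)unsupported << 32);
}

/* Opt-in: keep only the listed allowed-1 bits, leave bits 31:0 untouched. */
static uint64_t filter_opt_in(uint64_t msr, uint32_t supported)
{
	return msr & (((uint64_t)supported << 32) | 0xffffffffull);
}

/* Usage example: a capability MSR value advertising all three controls. */
static void demo_self_check(void)
{
	uint64_t msr = (uint64_t)(CTRL_A | CTRL_B | CTRL_NEW) << 32;

	/* Opt-out leaks the new control because nobody listed it. */
	assert((uint32_t)(filter_opt_out(msr, DEMO_UNSUPPORTED) >> 32) ==
	       (CTRL_A | CTRL_NEW));

	/* Opt-in hides it until it is explicitly added to the list. */
	assert((uint32_t)(filter_opt_in(msr, DEMO_SUPPORTED) >> 32) == CTRL_A);
}
```

With the opt-out list, a control added later is exposed to the guest unless someone remembers to extend the list. With the opt-in list it stays hidden by default, which is the future-proofing property the quoted paragraph is after, at the cost of a longer list.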