On Wed, Apr 10, 2024, Chao Gao wrote:
> From: Zhang Chen <chen.zhang@xxxxxxxxx>
> 
> Allow guest to report if the short BHB-clearing sequence is in use.
> 
> KVM will deploy BHI_DIS_S for the guest if the short BHB-clearing
> sequence is in use and the processor doesn't enumerate BHI_NO.
> 
> Signed-off-by: Zhang Chen <chen.zhang@xxxxxxxxx>
> Signed-off-by: Chao Gao <chao.gao@xxxxxxxxx>
> ---
>  arch/x86/kvm/vmx/vmx.c | 31 ++++++++++++++++++++++++++++---
>  1 file changed, 28 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index cc260b14f8df..c5ceaebd954b 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1956,8 +1956,8 @@ static inline bool is_vmx_feature_control_msr_valid(struct vcpu_vmx *vmx,
>  }
>  
>  #define VIRTUAL_ENUMERATION_VALID_BITS	VIRT_ENUM_MITIGATION_CTRL_SUPPORT
> -#define MITI_ENUM_VALID_BITS		0ULL
> -#define MITI_CTRL_VALID_BITS		0ULL
> +#define MITI_ENUM_VALID_BITS		MITI_ENUM_BHB_CLEAR_SEQ_S_SUPPORT
> +#define MITI_CTRL_VALID_BITS		MITI_CTRL_BHB_CLEAR_SEQ_S_USED
>  
>  static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
>  {
> @@ -2204,7 +2204,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  	struct vmx_uret_msr *msr;
>  	int ret = 0;
>  	u32 msr_index = msr_info->index;
> -	u64 data = msr_info->data;
> +	u64 data = msr_info->data, spec_ctrl_mask = 0;
>  	u32 index;
>  
>  	switch (msr_index) {
> @@ -2508,6 +2508,31 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  		if (data & ~MITI_CTRL_VALID_BITS)
>  			return 1;
>  
> +		if (data & MITI_CTRL_BHB_CLEAR_SEQ_S_USED &&
> +		    kvm_cpu_cap_has(X86_FEATURE_BHI_CTRL) &&
> +		    !(host_arch_capabilities & ARCH_CAP_BHI_NO))
> +			spec_ctrl_mask |= SPEC_CTRL_BHI_DIS_S;
> +
> +		/*
> +		 * Intercept IA32_SPEC_CTRL to disallow guest from changing
> +		 * certain bits if "virtualize IA32_SPEC_CTRL" isn't supported
> +		 * e.g., in nested case.
> +		 */
> +		if (spec_ctrl_mask && !cpu_has_spec_ctrl_shadow())
> +			vmx_enable_intercept_for_msr(vcpu, MSR_IA32_SPEC_CTRL, MSR_TYPE_RW);
> +
> +		/*
> +		 * KVM_CAP_FORCE_SPEC_CTRL takes precedence over
> +		 * MSR_VIRTUAL_MITIGATION_CTRL.
> +		 */
> +		spec_ctrl_mask &= ~vmx->vcpu.kvm->arch.force_spec_ctrl_mask;
> +
> +		vmx->force_spec_ctrl_mask = vmx->vcpu.kvm->arch.force_spec_ctrl_mask |
> +					    spec_ctrl_mask;
> +		vmx->force_spec_ctrl_value = vmx->vcpu.kvm->arch.force_spec_ctrl_value |
> +					     spec_ctrl_mask;
> +		vmx_set_spec_ctrl(&vmx->vcpu, vmx->spec_ctrl_shadow);
> +
>  		vmx->msr_virtual_mitigation_ctrl = data;
>  		break;

I continue to find all of this unpalatable.  The guest tells KVM what software
mitigations the guest is using, and then KVM is supposed to translate that into
some hardware functionality?  And merge that with userspace's own overrides?
Blech.

With KVM_CAP_FORCE_SPEC_CTRL, I don't see any reason for KVM to support the
Intel-defined virtual MSRs.  If the userspace VMM wants to play nice with the
Intel-defined stuff, then userspace can advertise the MSRs and use an MSR
filter to intercept and "emulate" the MSRs.  They should be set-and-forget
MSRs, so there's no need for KVM to handle them for performance reasons.

That way KVM doesn't need to deal with the virtual MSRs, userspace can make an
informed decision when deciding how to set KVM_CAP_FORCE_SPEC_CTRL, and as a
bonus, rollouts for new mitigation thingies should be faster as updating
userspace is typically easier than updating the kernel/KVM.
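
Rough, untested sketch of the userspace side, just to illustrate the idea.  The
virtual MSR indices and helper names below are placeholders (the real indices
come from Intel's virtual MSRs spec), and how the VMM maps the guest's report
onto KVM_CAP_FORCE_SPEC_CTRL is left open.  The point is that the existing
KVM_CAP_X86_USER_SPACE_MSR + KVM_X86_SET_MSR_FILTER uAPI already lets userspace
intercept and emulate the MSRs:

#include <linux/kvm.h>
#include <sys/ioctl.h>

/* Placeholder index/count for the Intel-defined virtual MSR range. */
#define MSR_VIRTUAL_ENUMERATION	0x50000000
#define NR_VIRTUAL_MSRS		3

static int trap_virtual_msrs(int vm_fd)
{
	__u8 deny_bitmap[1] = { 0 };	/* a 0 bit means "filtered" */
	struct kvm_msr_filter filter = {
		.flags = KVM_MSR_FILTER_DEFAULT_ALLOW,
		.ranges[0] = {
			.flags = KVM_MSR_FILTER_READ | KVM_MSR_FILTER_WRITE,
			.base = MSR_VIRTUAL_ENUMERATION,
			.nmsrs = NR_VIRTUAL_MSRS,
			.bitmap = deny_bitmap,
		},
	};
	struct kvm_enable_cap cap = {
		.cap = KVM_CAP_X86_USER_SPACE_MSR,
		.args[0] = KVM_MSR_EXIT_REASON_FILTER,
	};

	/* Filtered accesses exit to userspace instead of getting a #GP. */
	if (ioctl(vm_fd, KVM_ENABLE_CAP, &cap))
		return -1;

	return ioctl(vm_fd, KVM_X86_SET_MSR_FILTER, &filter);
}

/* In the vCPU run loop: emulate the filtered MSRs. */
static void handle_virtual_msr_exit(struct kvm_run *run)
{
	switch (run->exit_reason) {
	case KVM_EXIT_X86_RDMSR:
		run->msr.data = 0;	/* VMM-maintained value for run->msr.index */
		run->msr.error = 0;	/* non-zero would inject #GP */
		break;
	case KVM_EXIT_X86_WRMSR:
		/*
		 * E.g. when the guest reports the short BHB-clearing sequence
		 * via the mitigation-ctrl MSR, the VMM decides how to set
		 * KVM_CAP_FORCE_SPEC_CTRL for the VM.
		 */
		run->msr.error = 0;
		break;
	}
}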