On Sat, Feb 15, 2025 at 01:53:07PM +0100, Borislav Petkov wrote: > On Fri, Feb 14, 2025 at 09:10:05PM +0100, Borislav Petkov wrote: > > After talking with folks internally, you're probably right. We should slap an > > IBPB before clearing. Which means, I cannot use the MSR return slots anymore. > > I will have to resurrect some of the other solutions we had lined up... > > So I'm thinking about this (totally untested ofc) but I'm doing it in the CLGI > region so no need to worry about migration etc. > > Sean, that ok this way? I am no Sean, but I have some questions about this approach if that's okay :) First of all, the use of indirect_branch_prediction_barrier() is interesting to me because it only actually performs an IBPB if X86_FEATURE_USE_IBPB is set, which is set according to the spectre_v2 mitigation. It seems to me that indirect_branch_prediction_barrier() was originally intended for use only in switch_mm_irqs_off() -> cond_mitigation(), where the spectre_v2 mitigations are executed, then made its way to other places like KVM. Second, and probably more importantly, it seems like with this approach we will essentially be doing an IBPB on every VM-exit AND running the guest with reduced speculation. At least on the surface, this looks worse than an IBPB on VM-exit. My understanding is that this MSR is intended to be a more efficient mitigation than IBPB on VM-exit. This probably performs considerably worse than the previous approaches, so I am wondering which approach is the 1-2% regression figure associated with. If 1-2% is the cost for keeping the MSR enabled at all times, I wonder if we should just do that for simplicitly, and have it its own mitigation option (chosen by the cmdline). Alternatively, if that's too expensive, perhaps we can choose another boundary to clear the MSR at and perform an IBPB. I can think of two places: - Upon return to userspace (similar to your previous proposal). In this case we run userspace with the MSR cleared, and only perform an IBPB in the exit to userspace pace. - In the switch_mm() path (around cond_mitigation()). Basically we keep the MSR bit set until we go to a different process, at which point we clear it and do an IBPB. The MSR will bet set while the VMM is running, but other processes in the system won't take the hit. We can even be smarter and only do the MSR clear + IBPB if we're going from a process using KVM to process that isn't. We'll need more bookkeeping for that though. Any thoughts? Am I completely off base? > > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c > index 6ea3632af580..dcc4e5935b82 100644 > --- a/arch/x86/kvm/svm/svm.c > +++ b/arch/x86/kvm/svm/svm.c > @@ -4272,8 +4272,16 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu, > if (!static_cpu_has(X86_FEATURE_V_SPEC_CTRL)) > x86_spec_ctrl_set_guest(svm->virt_spec_ctrl); > > + if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE)) > + msr_set_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT); > + > svm_vcpu_enter_exit(vcpu, spec_ctrl_intercepted); > > + if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE)) { > + indirect_branch_prediction_barrier(); > + msr_clear_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT); > + } > + > if (!static_cpu_has(X86_FEATURE_V_SPEC_CTRL)) > x86_spec_ctrl_restore_host(svm->virt_spec_ctrl); > > -- > Regards/Gruss, > Boris. > > https://people.kernel.org/tglx/notes-about-netiquette