> On Apr 22, 2022, at 12:21 PM, Jon Kohler <jon@xxxxxxxxxxx> wrote:
>
> On vmx_vcpu_load_vmcs and svm_vcpu_load, respect the user controlled
> configuration for conditional IBPB and only issue the IBPB MSR write
> when switching between different guest vCPUs IFF switch_mm_always_ibpb
> is set. This fixes a situation where the kernel issues IBPB
> unconditionally even when conditional IBPB is enabled.
>
> If a user has the spectre_v2_user mitigation enabled in any
> configuration, and the underlying processor supports X86_FEATURE_IBPB,
> then X86_FEATURE_USE_IBPB is set and any call to
> indirect_branch_prediction_barrier() will issue the IBPB MSR write.
>
> Depending on the spectre_v2_user configuration, either the
> switch_mm_always_ibpb key or the switch_mm_cond_ibpb key will be
> enabled.
>
> Both switch_mm_always_ibpb and switch_mm_cond_ibpb are handled by
> switch_mm() -> cond_mitigation(), which works well in cases where
> switching vCPUs (i.e. switching tasks) also switches mm_structs;
> however, this misses a paranoid case where user space may be running
> multiple guests in a single process (i.e. a single mm_struct). This
> presents two issues:
>
> Issue 1:
> The paranoid case is already covered by vmx_vcpu_load_vmcs and
> svm_vcpu_load; however, it is handled by calling
> indirect_branch_prediction_barrier(), so the kernel unconditionally
> issues IBPB whenever X86_FEATURE_USE_IBPB is set.
>
> Issue 2:
> For a conditional configuration, the paranoid case is nonsensical.
> If userspace runs multiple VMs in the same process, enables cond_ibpb,
> _and_ sets TIF_SPEC_IB, it isn't getting full protection in any case,
> e.g. if userspace is handling an exit-to-userspace condition for two
> vCPUs from different VMs, the kernel could switch between those two
> vCPUs' tasks without bouncing through KVM and thus without doing
> KVM's IBPB.
>
> Fix both by using an intermediary call, x86_virt_guest_switch_ibpb(),
> which issues the IBPB MSR write IFF switch_mm_always_ibpb is true.
>
> switch_mm_cond_ibpb is intentionally ignored on the KVM side, as the
> common case is already well covered by switch_mm(), so issuing an
> additional IBPB from KVM would be pure overhead.
>
> Note: the switch_mm_always_ibpb key is user controlled via
> spectre_v2_user and will be true for the following configurations:
>   spectre_v2_user=on
>   spectre_v2_user=prctl,ibpb
>   spectre_v2_user=seccomp,ibpb
>
> Signed-off-by: Jon Kohler <jon@xxxxxxxxxxx>
> Cc: Sean Christopherson <seanjc@xxxxxxxxxx>
> Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx>
> Cc: Kees Cook <keescook@xxxxxxxxxxxx>
> Cc: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
> Cc: Waiman Long <longman@xxxxxxxxxx>
> ---
> v1 -> v2:
>  - Addressed comments on approach from Sean.
> v2 -> v3:
>  - Updated spec-ctrl.h comments and commit msg to incorporate
>    additional feedback from Sean.
>
Gentle ping on this one, thanks, Jon

>  arch/x86/include/asm/spec-ctrl.h | 14 ++++++++++++++
>  arch/x86/kernel/cpu/bugs.c       |  6 +++++-
>  arch/x86/kvm/svm/svm.c           |  2 +-
>  arch/x86/kvm/vmx/vmx.c           |  2 +-
>  4 files changed, 21 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/include/asm/spec-ctrl.h b/arch/x86/include/asm/spec-ctrl.h
> index 5393babc0598..99d3341d2e21 100644
> --- a/arch/x86/include/asm/spec-ctrl.h
> +++ b/arch/x86/include/asm/spec-ctrl.h
> @@ -85,4 +85,18 @@ static inline void speculative_store_bypass_ht_init(void) { }
>  extern void speculation_ctrl_update(unsigned long tif);
>  extern void speculation_ctrl_update_current(void);
>
> +/*
> + * Issue IBPB when switching guest vCPUs IFF switch_mm_always_ibpb.
> + * For the more common case of running VMs in their own dedicated process,
> + * switching vCPUs that belong to different VMs, i.e. switching tasks,
> + * will also switch mm_structs and thus do IBPB via cond_mitigation();
> + * however, in the always_ibpb case, take a paranoid approach and issue
> + * IBPB on both switch_mm() and vCPU switch.
> + */
> +static inline void x86_virt_guest_switch_ibpb(void)
> +{
> +	if (static_branch_unlikely(&switch_mm_always_ibpb))
> +		indirect_branch_prediction_barrier();
> +}
> +
>  #endif
> diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
> index 6296e1ebed1d..6aafb0279cbc 100644
> --- a/arch/x86/kernel/cpu/bugs.c
> +++ b/arch/x86/kernel/cpu/bugs.c
> @@ -68,8 +68,12 @@ u64 __ro_after_init x86_amd_ls_cfg_ssbd_mask;
>  DEFINE_STATIC_KEY_FALSE(switch_to_cond_stibp);
>  /* Control conditional IBPB in switch_mm() */
>  DEFINE_STATIC_KEY_FALSE(switch_mm_cond_ibpb);
> -/* Control unconditional IBPB in switch_mm() */
> +/* Control unconditional IBPB in both switch_mm() and
> + * x86_virt_guest_switch_ibpb().
> + * See notes on x86_virt_guest_switch_ibpb() for KVM use case details.
> + */
>  DEFINE_STATIC_KEY_FALSE(switch_mm_always_ibpb);
> +EXPORT_SYMBOL_GPL(switch_mm_always_ibpb);
>
>  /* Control MDS CPU buffer clear before returning to user space */
>  DEFINE_STATIC_KEY_FALSE(mds_user_clear);
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index bd4c64b362d2..fc08c94df888 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -1302,7 +1302,7 @@ static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>
>  	if (sd->current_vmcb != svm->vmcb) {
>  		sd->current_vmcb = svm->vmcb;
> -		indirect_branch_prediction_barrier();
> +		x86_virt_guest_switch_ibpb();
>  	}
>  	if (kvm_vcpu_apicv_active(vcpu))
>  		__avic_vcpu_load(vcpu, cpu);
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 04d170c4b61e..a8eed9b8221b 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1270,7 +1270,7 @@ void vmx_vcpu_load_vmcs(struct kvm_vcpu *vcpu, int cpu,
>  	 * The L1 VMM can protect itself with retpolines, IBPB or IBRS.
>  	 */
>  	if (!buddy || WARN_ON_ONCE(buddy->vmcs != prev))
> -		indirect_branch_prediction_barrier();
> +		x86_virt_guest_switch_ibpb();
>  	}
>
>  	if (!already_loaded) {
> --
> 2.30.1 (Apple Git-130)
>
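For readers skimming the thread, the switch_mm() coverage the commit message leans on lives in cond_mitigation() (arch/x86/mm/tlb.c), and each barrier ultimately boils down to a write of PRED_CMD_IBPB to MSR_IA32_PRED_CMD, gated on X86_FEATURE_USE_IBPB. The sketch below is a simplified approximation of that gating, not the exact kernel code: cond_mitigation_sketch() and its prev_mm/next_mm parameters are illustrative stand-ins for the per-CPU last_user_mm_spec bookkeeping the real function maintains.

/*
 * Simplified sketch of the IBPB gating in switch_mm() -> cond_mitigation().
 * prev_mm/next_mm stand in for the mangled mm values the real code tracks
 * in cpu_tlbstate.last_user_mm_spec.
 */
static void cond_mitigation_sketch(unsigned long prev_mm, unsigned long next_mm)
{
	/* Conditional IBPB: only if a task involved opted in via TIF_SPEC_IB. */
	if (static_branch_likely(&switch_mm_cond_ibpb) &&
	    next_mm != prev_mm && ((next_mm | prev_mm) & LAST_USER_MM_IBPB))
		indirect_branch_prediction_barrier();

	/* Unconditional IBPB: on every mm switch, regardless of opt-in. */
	if (static_branch_unlikely(&switch_mm_always_ibpb) && next_mm != prev_mm)
		indirect_branch_prediction_barrier();
}

Under cond_ibpb, the task switch that moved userspace onto a different vCPU thread has already made this decision in switch_mm(), which is why the patch treats an extra IBPB from KVM as pure overhead and keeps the vCPU-switch barrier only for the always_ibpb configurations.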