On Thu, Oct 06, 2022, Jim Mattson wrote: > On Wed, Sep 28, 2022 at 4:53 PM Jim Mattson <jmattson@xxxxxxxxxx> wrote: > > > > According to Intel's document on Indirect Branch Restricted > > Speculation, "Enabling IBRS does not prevent software from controlling > > the predicted targets of indirect branches of unrelated software > > executed later at the same predictor mode (for example, between two > > different user applications, or two different virtual machines). Such > > isolation can be ensured through use of the Indirect Branch Predictor > > Barrier (IBPB) command." This applies to both basic and enhanced IBRS. > > > > Since L1 and L2 VMs share hardware predictor modes (guest-user and > > guest-kernel), hardware IBRS is not sufficient to virtualize > > IBRS. (The way that basic IBRS is implemented on pre-eIBRS parts, > > hardware IBRS is actually sufficient in practice, even though it isn't > > sufficient architecturally.) > > > > For virtual CPUs that support IBRS, add an indirect branch prediction > > barrier on emulated VM-exit, to ensure that the predicted targets of > > indirect branches executed in L1 cannot be controlled by software that > > was executed in L2. > > > > Since we typically don't intercept guest writes to IA32_SPEC_CTRL, > > perform the IBPB at emulated VM-exit regardless of the current > > IA32_SPEC_CTRL.IBRS value, even though the IBPB could technically be > > deferred until L1 sets IA32_SPEC_CTRL.IBRS, if IA32_SPEC_CTRL.IBRS is > > clear at emulated VM-exit. > > > > This is CVE-2022-2196. > > > > Fixes: 5c911beff20a ("KVM: nVMX: Skip IBPB when switching between vmcs01 and vmcs02") > > Cc: Sean Christopherson <seanjc@xxxxxxxxxx> > > Signed-off-by: Jim Mattson <jmattson@xxxxxxxxxx> Code looks good, just some whining about the comments. > > --- > > arch/x86/kvm/vmx/nested.c | 8 ++++++++ > > arch/x86/kvm/vmx/vmx.c | 8 +++++--- > > 2 files changed, 13 insertions(+), 3 deletions(-) > > > > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c > > index ddd4367d4826..87993951fe47 100644 > > --- a/arch/x86/kvm/vmx/nested.c > > +++ b/arch/x86/kvm/vmx/nested.c > > @@ -4604,6 +4604,14 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason, > > > > vmx_switch_vmcs(vcpu, &vmx->vmcs01); > > > > + /* > > + * For virtual IBRS, The "virtual" part confused me (even more so than usual for all this mitigation stuff). At first I thought "virtual" was referring to some specialized hardware. > we have to flush the indirect branch Avoid pronouns please. > > + * predictors, since L1 and L2 share hardware predictor > > + * modes. Wrap at 80 chars, this easily fits on two lines. Though I'd prefer to make this comment much longer, I have a terrible time keeping track of this stuff (obviously). Is this accurate? /* * If IBRS is advertised to the guest, KVM must flush the indirect * branch predictors when transitioning from L2 to L1, as L1 expects * hardware (KVM in this case) to provide separate predictor modes. * Bare metal isolates VMX root (host) from VMX non-root (guest), but * doesn't isolate different VMs, i.e. in this case, doesn't provide * separate modes for L2 vs L1. */ > > + */ > > + if (guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL)) > > + indirect_branch_prediction_barrier(); > > + > > /* Update any VMCS fields that might have changed while L2 ran */ > > vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.host.nr); > > vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, vmx->msr_autoload.guest.nr); > > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c > > index ffe552a82044..bf8b5c9c56ae 100644 > > --- a/arch/x86/kvm/vmx/vmx.c > > +++ b/arch/x86/kvm/vmx/vmx.c > > @@ -1347,9 +1347,11 @@ void vmx_vcpu_load_vmcs(struct kvm_vcpu *vcpu, int cpu, > > vmcs_load(vmx->loaded_vmcs->vmcs); > > > > /* > > - * No indirect branch prediction barrier needed when switching > > - * the active VMCS within a guest, e.g. on nested VM-Enter. > > - * The L1 VMM can protect itself with retpolines, IBPB or IBRS. > > + * No indirect branch prediction barrier needed when > > + * switching the active VMCS within a guest, except Wrap at 80. > > + * for virtual IBRS. To minimize the number of IBPBs Same nit on "virtual IBRS". > > + * executed, the one to support virtual IBRS is > > + * handled specially in nested_vmx_vmexit(). "nested VM-Exit" instead of nested_vmx_vmexit() to avoid rename hell, though that one is unlikely to get renamed. This work? /* * No indirect branch prediction barrier needed when switching * the active VMCS within a guest, except if IBRS is advertised * to the guest. To minimize the number of IBPBs executed, KVM * performs IBPB on nested VM-Exit (a single nested transition * may switch the active VMCS multiple times). */ > > */ > > if (!buddy || WARN_ON_ONCE(buddy->vmcs != prev)) > > indirect_branch_prediction_barrier(); > > -- > > 2.37.3.998.g577e59143f-goog > > > Ping.