Re: [PATCH 2/2] KVM: VMX: Execute IBPB on emulated VM-exit when guest has IBRS

On Thu, Oct 6, 2022 at 5:35 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Thu, Oct 06, 2022, Jim Mattson wrote:
> > On Wed, Sep 28, 2022 at 4:53 PM Jim Mattson <jmattson@xxxxxxxxxx> wrote:
> > >
> > > According to Intel's document on Indirect Branch Restricted
> > > Speculation, "Enabling IBRS does not prevent software from controlling
> > > the predicted targets of indirect branches of unrelated software
> > > executed later at the same predictor mode (for example, between two
> > > different user applications, or two different virtual machines). Such
> > > isolation can be ensured through use of the Indirect Branch Predictor
> > > Barrier (IBPB) command." This applies to both basic and enhanced IBRS.
> > >
> > > Since L1 and L2 VMs share hardware predictor modes (guest-user and
> > > guest-kernel), hardware IBRS is not sufficient to virtualize
> > > IBRS. (The way that basic IBRS is implemented on pre-eIBRS parts,
> > > hardware IBRS is actually sufficient in practice, even though it isn't
> > > sufficient architecturally.)
> > >
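
To make the shared-mode point concrete (using the same mode names as
the text above; Intel's documentation says "supervisor" where the
commit message says "kernel"), the hardware predictor modes map to
software entities roughly like this:

	host-kernel  : L0 (KVM)
	host-user    : L0 userspace (the VMM process)
	guest-kernel : L1 kernel and L2 kernel alike
	guest-user   : L1 userspace and L2 userspace alike

IBRS only constrains prediction across a predictor mode change, and an
emulated VM-exit from L2 to L1 never changes modes, which is why
hardware IBRS alone can't provide the isolation L1 expects.
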
> > > For virtual CPUs that support IBRS, add an indirect branch prediction
> > > barrier on emulated VM-exit, to ensure that the predicted targets of
> > > indirect branches executed in L1 cannot be controlled by software that
> > > was executed in L2.
> > >
> > > Since we typically don't intercept guest writes to IA32_SPEC_CTRL,
> > > perform the IBPB at emulated VM-exit regardless of the current
> > > IA32_SPEC_CTRL.IBRS value, even though the IBPB could technically be
> > > deferred until L1 sets IA32_SPEC_CTRL.IBRS, if IA32_SPEC_CTRL.IBRS is
> > > clear at emulated VM-exit.
> > >
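
To spell out the deferred alternative being rejected here: it would
look something like the untested sketch below, where "pending_ibpb" is
a made-up field. The catch is the second half, which only works if KVM
re-intercepts WRMSR(IA32_SPEC_CTRL) whenever the flag is set, just to
observe the IBRS write, and that's exactly the complexity the
unconditional IBPB avoids.

	/* In nested_vmx_vmexit(), instead of the unconditional IBPB: */
	if (guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL)) {
		if (vmx->spec_ctrl & SPEC_CTRL_IBRS)
			indirect_branch_prediction_barrier();
		else
			vmx->nested.pending_ibpb = true; /* hypothetical */
	}

	/*
	 * In the IA32_SPEC_CTRL WRMSR handler, which would have to be
	 * (re-)intercepted while pending_ibpb is set:
	 */
	if (vmx->nested.pending_ibpb && (data & SPEC_CTRL_IBRS)) {
		indirect_branch_prediction_barrier();
		vmx->nested.pending_ibpb = false;
	}
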
> > > This is CVE-2022-2196.
> > >
> > > Fixes: 5c911beff20a ("KVM: nVMX: Skip IBPB when switching between vmcs01 and vmcs02")
> > > Cc: Sean Christopherson <seanjc@xxxxxxxxxx>
> > > Signed-off-by: Jim Mattson <jmattson@xxxxxxxxxx>
>
> Code looks good, just some whining about the comments.
>
> > > ---
> > >  arch/x86/kvm/vmx/nested.c | 8 ++++++++
> > >  arch/x86/kvm/vmx/vmx.c    | 8 +++++---
> > >  2 files changed, 13 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> > > index ddd4367d4826..87993951fe47 100644
> > > --- a/arch/x86/kvm/vmx/nested.c
> > > +++ b/arch/x86/kvm/vmx/nested.c
> > > @@ -4604,6 +4604,14 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason,
> > >
> > >         vmx_switch_vmcs(vcpu, &vmx->vmcs01);
> > >
> > > +       /*
> > > +        * For virtual IBRS,
>
> The "virtual" part confused me (even more so than usual for all this mitigation
> stuff).  At first I thought "virtual" was referring to some specialized hardware.
>
> > we have to flush the indirect branch
>
> Avoid pronouns please.
>
> > > +        * predictors, since L1 and L2 share hardware predictor
> > > +        * modes.
>
> Wrap at 80 chars, this easily fits on two lines.  Though I'd prefer to make this
> comment much longer, I have a terrible time keeping track of this stuff (obviously).
>
> Is this accurate?
>
>         /*
>          * If IBRS is advertised to the guest, KVM must flush the indirect

Advertising is done on a per-vCPU basis, so "advertised to the vCPU"?

>          * branch predictors when transitioning from L2 to L1, as L1 expects
>          * hardware (KVM in this case) to provide separate predictor modes.
>          * Bare metal isolates VMX root (host) from VMX non-root (guest), but
>          * doesn't isolate different VMs, i.e. in this case, doesn't provide

"different VMCSs"?

>          * separate modes for L2 vs L1.
>          */
>
> > > +        */
> > > +       if (guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL))
> > > +               indirect_branch_prediction_barrier();
> > > +
> > >         /* Update any VMCS fields that might have changed while L2 ran */
> > >         vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.host.nr);
> > >         vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, vmx->msr_autoload.guest.nr);
> > > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > > index ffe552a82044..bf8b5c9c56ae 100644
> > > --- a/arch/x86/kvm/vmx/vmx.c
> > > +++ b/arch/x86/kvm/vmx/vmx.c
> > > @@ -1347,9 +1347,11 @@ void vmx_vcpu_load_vmcs(struct kvm_vcpu *vcpu, int cpu,
> > >                 vmcs_load(vmx->loaded_vmcs->vmcs);
> > >
> > >                 /*
> > > -                * No indirect branch prediction barrier needed when switching
> > > -                * the active VMCS within a guest, e.g. on nested VM-Enter.
> > > -                * The L1 VMM can protect itself with retpolines, IBPB or IBRS.
> > > +                * No indirect branch prediction barrier needed when
> > > +                * switching the active VMCS within a guest, except
>
> Wrap at 80.
>
> > > +                * for virtual IBRS. To minimize the number of IBPBs
>
> Same nit on "virtual IBRS".
>
> > > +                * executed, the one to support virtual IBRS is
> > > +                * handled specially in nested_vmx_vmexit().
>
> "nested VM-Exit" instead of nested_vmx_vmexit() to avoid rename hell, though that
> one is unlikely to get renamed.
>
> This work?
>
>                 /*
>                  * No indirect branch prediction barrier needed when switching
>                  * the active VMCS within a guest, except if IBRS is advertised

"within a vCPU"?

>                  * to the guest.  To minimize the number of IBPBs executed, KVM
>                  * performs IBPB on nested VM-Exit (a single nested transition
>                  * may switch the active VMCS multiple times).
>                  */
>
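Works for me. For the archives, the "multiple times" case is e.g. a
failed nested VM-Enter, where KVM does roughly:

	vmx_switch_vmcs(vcpu, &vmx->vmcs02);	/* start nested VM-Enter */
	/* ... prepare_vmcs02() or a consistency check fails ... */
	vmx_switch_vmcs(vcpu, &vmx->vmcs01);	/* back to L1 */

so hooking the IBPB into vmx_vcpu_load_vmcs() could issue two barriers
for what is architecturally a single transition; doing it once at
nested VM-Exit avoids that.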

Should I send v2, or can this all be handled when applying?


