On Thu, May 12, 2022 at 12:51 PM Jon Kohler <jon@xxxxxxxxxxx> wrote:
>
>
>
> > On May 12, 2022, at 3:35 PM, Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> >
> > On Thu, May 12, 2022, Sean Christopherson wrote:
> >> On Thu, May 12, 2022, Jon Kohler wrote:
> >>> Remove IBPB that is done on KVM vCPU load, as the guest-to-guest
> >>> attack surface is already covered by switch_mm_irqs_off() ->
> >>> cond_mitigation().
> >>>
> >>> The original commit 15d45071523d ("KVM/x86: Add IBPB support") was simply
> >>> wrong in its guest-to-guest design intention. There are three scenarios
> >>> at play here:
> >>
> >> Jim pointed out offline that there's a case we didn't consider. When
> >> switching between vCPUs in the same VM, an IBPB may be warranted as the
> >> tasks in the VM may be in different security domains. E.g. the guest will
> >> not get a notification that vCPU0 is being swapped out for vCPU1 on a
> >> single pCPU.
> >>
> >> So, sadly, after all that, I think the IBPB needs to stay. But the
> >> documentation most definitely needs to be updated.
> >>
> >> A per-VM capability to skip the IBPB may be warranted, e.g. for
> >> container-like use cases where a single VM is running a single workload.
> >
> > Ah, actually, the IBPB can be skipped if the vCPUs have different
> > mm_structs, because then the IBPB is fully redundant with respect to any
> > IBPB performed by switch_mm_irqs_off(). Hrm, though it might need a KVM
> > or per-VM knob, e.g. just because the VMM doesn't want IBPB doesn't mean
> > the guest doesn't want IBPB.
> >
> > That would also sidestep the largely theoretical question of whether
> > vCPUs from different VMs but the same address space are in the same
> > security domain. It doesn't matter, because even if they are in the same
> > domain, KVM still needs to do IBPB.
>
> So should we go back to the earlier approach where we have it be only
> IBPB on always_ibpb? Or what?
>
> At minimum, we need to fix the unilateral-ness of all of this :) since we’re
> IBPB’ing even when the user did not explicitly tell us to.
>
> That said, since I just re-read the documentation today, it does specifically
> suggest that if the guest wants to protect *itself* it should turn on IBPB or
> STIBP (or other mitigations galore), so I think we end up having to think
> about what our “contract” is with users who host their workloads on
> KVM - are they expecting us to protect them in any/all cases?
>
> Said another way, the internal guest areas of concern aren’t something
> the kernel would always be able to A) identify far in advance and B)
> always solve on the user's behalf. There is an argument to be made
> that the guest needs to deal with its own house, yea?

To the extent that the guest has control over its own house, yes.

Say the guest obviates the need for internal IBPB by statically partitioning
virtual cores into different security domains. If the hypervisor breaks core
isolation on the physical platform, it is responsible for providing the
necessary mitigations.