Re: [PATCH v4] x86/speculation, KVM: remove IBPB on vCPU load

On Thu, May 12, 2022 at 1:34 PM Jon Kohler <jon@xxxxxxxxxxx> wrote:
>
>
>
> > On May 12, 2022, at 4:27 PM, Jim Mattson <jmattson@xxxxxxxxxx> wrote:
> >
> > On Thu, May 12, 2022 at 1:07 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> >>
> >> On Thu, May 12, 2022, Jon Kohler wrote:
> >>>
> >>>
> >>>> On May 12, 2022, at 3:35 PM, Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> >>>>
> >>>> On Thu, May 12, 2022, Sean Christopherson wrote:
> >>>>> On Thu, May 12, 2022, Jon Kohler wrote:
> >>>>>> Remove IBPB that is done on KVM vCPU load, as the guest-to-guest
> >>>>>> attack surface is already covered by switch_mm_irqs_off() ->
> >>>>>> cond_mitigation().
> >>>>>>
> >>>>>> The original commit 15d45071523d ("KVM/x86: Add IBPB support") was simply
> >>>>>> wrong in its guest-to-guest design intention. There are three scenarios
> >>>>>> at play here:
> >>>>>
> >>>>> Jim pointed out offline that there's a case we didn't consider.  When switching between
> >>>>> vCPUs in the same VM, an IBPB may be warranted as the tasks in the VM may be in
> >>>>> different security domains.  E.g. the guest will not get a notification that vCPU0 is
> >>>>> being swapped out for vCPU1 on a single pCPU.
> >>>>>
> >>>>> So, sadly, after all that, I think the IBPB needs to stay.  But the documentation
> >>>>> most definitely needs to be updated.
> >>>>>
> >>>>> A per-VM capability to skip the IBPB may be warranted, e.g. for container-like
> >>>>> use cases where a single VM is running a single workload.
> >>>>
> >>>> Ah, actually, the IBPB can be skipped if the vCPUs have different mm_structs,
> >>>> because then the IBPB is fully redundant with respect to any IBPB performed by
> >>>> switch_mm_irqs_off().  Hrm, though it might need a KVM or per-VM knob, e.g. just
> >>>> because the VMM doesn't want IBPB doesn't mean the guest doesn't want IBPB.
> >>>>
> >>>> That would also sidestep the largely theoretical question of whether vCPUs from
> >>>> different VMs but the same address space are in the same security domain.  It doesn't
> >>>> matter, because even if they are in the same domain, KVM still needs to do IBPB.
> >>>
> >>> So should we go back to the earlier approach where we have it be only
> >>> IBPB on always_ibpb? Or what?
> >>>
> >>> At minimum, we need to fix the unilateral-ness of all of this :) since we’re
> >>> IBPB’ing even when the user did not explicitly tell us to.
> >>
> >> I think we need separate controls for the guest.  E.g. if the userspace VMM is
> >> sufficiently hardened then it can run without the "do IBPB" flag, but that doesn't
> >> mean that the entire guest it's running is sufficiently hardened.
> >>
> >>> That said, since I just re-read the documentation today, it does specifically
> >>> suggest that if the guest wants to protect *itself* it should turn on IBPB or
> >>> STIBP (or other mitigations galore), so I think we end up having to think
> >>> about what our “contract” is with users who host their workloads on
> >>> KVM - are they expecting us to protect them in any/all cases?
> >>>
> >>> Said another way, the internal guest areas of concern aren’t something
> >>> the kernel would always be able to A) identify far in advance and B)
> >>> always solve on the users' behalf. There is an argument to be made
> >>> that the guest needs to deal with its own house, yea?
> >>
> >> The issue is that the guest won't get a notification if vCPU0 is replaced with
> >> vCPU1 on the same physical CPU, thus the guest doesn't get an opportunity to emit
> >> IBPB.  Since the host doesn't know whether or not the guest wants IBPB, unless the
> >> owner of the host is also the owner of the guest workload, the safe approach is to
> >> assume the guest is vulnerable.
> >
> > Exactly. And if the guest has used taskset as its mitigation strategy,
> > how is the host to know?
>
> Yea that's fair enough. I posed a solution on Sean’s response just as this email
> came in, would love to know your thoughts (keying off MSR bitmap).
>

I don't believe this works. The IBPBs in cond_mitigation (static in
arch/x86/mm/tlb.c) won't be triggered if the guest has given its
sensitive tasks exclusive use of their cores. And, if performance is a
concern, that is exactly what someone would do.
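
To make the scenario concrete, here is a toy model of the three decisions
discussed in this thread. It is only a sketch: every type and function name
below is hypothetical and none of it is kernel code. It assumes an
"IBPB on every mm change" host policy for the context-switch path, and it
assumes the MSR-bitmap idea amounts to "skip the host-side IBPB until the
guest has been seen issuing one itself", which is one reading of the
proposal and is not confirmed by the thread.

```c
#include <stdbool.h>

/* Toy model only: hypothetical types and helpers, not kernel APIs. */
struct toy_task {
    int mm_id;   /* stands in for the host mm_struct of the thread */
    int vcpu_id; /* stands in for the vCPU this host thread runs */
};

/*
 * Models the host context-switch path under an assumed
 * "IBPB whenever the mm changes" policy.
 */
bool toy_switch_mm_wants_ibpb(const struct toy_task *prev,
                              const struct toy_task *next)
{
    return prev->mm_id != next->mm_id;
}

/*
 * Models the IBPB on vCPU load: issue a barrier whenever a
 * different vCPU is loaded on the same physical CPU.
 */
bool toy_vcpu_load_wants_ibpb(const struct toy_task *prev,
                              const struct toy_task *next)
{
    return prev->vcpu_id != next->vcpu_id;
}

/*
 * Models the assumed heuristic: keep the vCPU-load IBPB only if the
 * guest has already been observed issuing an IBPB on its own.
 */
bool toy_heuristic_wants_ibpb(bool guest_has_issued_ibpb,
                              const struct toy_task *prev,
                              const struct toy_task *next)
{
    return guest_has_issued_ibpb && toy_vcpu_load_wants_ibpb(prev, next);
}
```

With two vCPU threads of the same VM (same mm_id, different vcpu_id), the
switch_mm model reports no IBPB, the vCPU-load model reports one, and the
heuristic reports none as long as the guest has never issued an IBPB. A
guest that pins its sensitive tasks to dedicated cores never
context-switches there, so it stays in that last case indefinitely.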



