Re: [PATCH v3] x86/speculation, KVM: only IBPB for switch_mm_always_ibpb on vCPU load

Sean Christopherson <seanjc@xxxxxxxxxx> · Fri, 29 Apr 2022 21:59:52 +0000

On Fri, Apr 29, 2022, Borislav Petkov wrote:
> On Fri, Apr 29, 2022 at 08:29:30PM +0000, Sean Christopherson wrote:
> > That's why there's a bunch of hand-waving.
> 
> Well, I'm still not sure what this patch is trying to fix but both your
> latest replies do sound clearer...
> 
> > Can you clarify what "this" is?  Does "this" mean "this patch", or does it mean
> 
> This patch.
> 
> > "this IBPB when switching vCPUs"?  Because if it means the latter, then I think
> > you're in violent agreement; the IBPB when switching vCPUs is pointless and
> > unnecessary.
> 
> Ok, let's concentrate on the bug first - whether a second IBPB - so to
> speak - is needed. Doing some git archeology points to:
> 
>   15d45071523d ("KVM/x86: Add IBPB support")
> 
> which - and I'm surprised - goes to great lengths to explain what
> those IBPB calls in KVM protect against. From that commit message, for
> example:
> 
> "    * Mitigate attacks from guest/ring3->host/ring3.
>       These would require a IBPB during context switch in host, or after
>       VMEXIT."

Except that snippet changelog doesn't actually state what KVM does, it states what
a hypervsior _could_ do to protect the host from the guest via IBPB.

> so with my very limited virt understanding, when you vmexit, you don't
> do switch_mm(), right?

Correct, but KVM also doesn't do IBPB on VM-Exit (or VM-Entry), nor does KVM do
IBPB before exiting to userspace.  The IBPB we want to whack is issued only when
KVM is switching vCPUs.

> If so, you need to do a barrier. Regardless of conditional IBPB or not
> as you want to protect the host from a malicious guest.
> 
> In general, the whole mitigation strategies are enumerated in
> 
> Documentation/admin-guide/hw-vuln/spectre.rst
> 
> There's also a "3. VM mitigation" section.
> 
> And so on...
> 
> Bottomline is this: at the time, we went to great lengths to document
> what the attacks are and how we are protecting against them.

Except that _none_ of that documentation explains why the hell KVM does IBPB when
switching betwen vCPUs.  The only item is this snippet from the changelog:

    * Mitigate guests from being attacked by other guests.
      - This is addressed by issing IBPB when we do a guest switch.

And that's the one that I pointed out in v1 as being flawed/wrong, and how Jon
ended up with this patch.

  : But stepping back, why does KVM do its own IBPB in the first place?  The goal is
  : to prevent one vCPU from attacking the next vCPU run on the same pCPU.  But unless
  : userspace is running multiple VMs in the same process/mm_struct, switching vCPUs,
  : i.e. switching tasks, will also switch mm_structs and thus do IPBP via cond_mitigation.
  :
  : If userspace runs multiple VMs in the same process, enables cond_ipbp, _and_ sets
  : TIF_SPEC_IB, then it's being stupid and isn't getting full protection in any case,
  : e.g. if userspace is handling an exit-to-userspace condition for two vCPUs from
  : different VMs, then the kernel could switch between those two vCPUs' tasks without
  : bouncing through KVM and thus without doing KVM's IBPB.
  :
  : I can kinda see doing this for always_ibpb, e.g. if userspace is unaware of spectre
  : and is naively running multiple VMs in the same process.
  :
  : What am I missing?