Re: [PATCH] KVM: x86: Set BHI_NO in guest when host is not affected by BHI

Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> · Fri, 12 Apr 2024 12:33:45 -0400

On Fri, Apr 12, 2024 at 11:24:54AM +0800, Chao Gao wrote:
> On Thu, Apr 11, 2024 at 04:50:12PM -0400, Konrad Rzeszutek Wilk wrote:
> >On Thu, Apr 11, 2024 at 11:56:39PM +0800, Chao Gao wrote:
> >> On Thu, Apr 11, 2024 at 05:20:30PM +0200, Paolo Bonzini wrote:
> >> >On Thu, Apr 11, 2024 at 5:13 PM Alexandre Chartre
> >> ><alexandre.chartre@xxxxxxxxxx> wrote:
> >> >> I think that Andrew's concern is that if there is no eIBRS on the host then
> >> >> we do not set X86_BUG_BHI on the host because we know the kernel which is
> >> >> running and this kernel has some mitigations (other than the explicit BHI
> >> >> mitigations) and these mitigations are enough to prevent BHI. But still
> >> >> the cpu is affected by BHI.
> >> >
> >> >Hmm, then I'm confused. It's what I wrote before: "The (Linux or
> >> >otherwise) guest will make its own determinations as to whether BHI
> >> >mitigations are necessary. If the guest uses eIBRS, it will run with
> >> >mitigations" but you said machines without eIBRS are fine.
> >> >
> >> >If instead they are only fine _with Linux_, then yeah we cannot set
> >> >BHI_NO in general. What we can do is define a new bit that is in the
> >> >KVM leaves. The new bit is effectively !eIBRS, except that it is
> >> >defined in such a way that, in a mixed migration pool, both eIBRS and
> >> >the new bit will be 0.
> >> 
> >> This looks like a good solution.
> >> 
> >> We can also introduce a new bit indicating the effectiveness of the short
> >> BHB-clearing sequence. KVM advertises this bit for all pre-SPR/ADL parts.
> >> Only if the bit is 1, guests will use the short BHB-clearing sequence.
> >> Otherwise guests should use the long sequence. In a mixed migration pool,
> >> the VMM shouldn't expose the bit to guests.
> >
> >Is there a link to this 'short BHB-clearing sequence'?
> 
> https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/branch-history-injection.html#inpage-nav-4-4
> 
> >
> >But on your email, should Skylake guests enable IBRS (or retpoline)
> >and have the short BHB clearing sequence?
> >
> >And IceLake/Cascade lake should use eIBRS (or retpoline) and short BHB
> >clearing sequence?
> >
> >If we already know all of this why does the hypervisor need to advertise
> >this to the guest? They can look up the CPU data to make this determination, no?
> >
> >I don't actually understand how one could do a mixed migration pool with
> >the various mitigations one has to engage (or not) based on the host one
> >is running under.
> 
> In my understanding, it is done at the cost of performance. The idea is to
> report the "worst" case in a mixed migration pool to guests, i.e.,
> 
>   Hey, you are running on a host where eIBRS is available (and/or the short
>   BHB-clearing sequence is ineffective). Now, select your mitigation for BHI.

And if the pool could be Skylake, IceLake, CascadeLake and Sapphire Rapids,
then the lowest common one is IBRS. And you would expose the long
BHB-clearing sequence as enabled for all of them too, methinks?
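
To make the "lowest common one" point concrete, here is a rough sketch of
how a VMM could intersect per-host feature bits across a pool. All names
and bit positions are made up for illustration (none are real CPUID or KVM
definitions), and the per-host values are only my reading of this thread:
Skylake without eIBRS, IceLake/CascadeLake with eIBRS, and the short
sequence assumed effective only on pre-SPR parts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bits; NOT real CPUID/KVM definitions. */
#define HYP_FEAT_EIBRS        (1u << 0) /* host has eIBRS */
#define HYP_FEAT_NO_EIBRS     (1u << 1) /* the proposed "!eIBRS" KVM bit */
#define HYP_FEAT_SHORT_BHB_OK (1u << 2) /* short BHB-clearing sequence works */

/* Illustrative per-host values, assumed from the discussion above. */
#define HYP_HOST_SKYLAKE      (HYP_FEAT_NO_EIBRS | HYP_FEAT_SHORT_BHB_OK)
#define HYP_HOST_ICELAKE      (HYP_FEAT_EIBRS | HYP_FEAT_SHORT_BHB_OK)
#define HYP_HOST_CASCADELAKE  (HYP_FEAT_EIBRS | HYP_FEAT_SHORT_BHB_OK)
#define HYP_HOST_SPR          (HYP_FEAT_EIBRS)

/*
 * Return the bits that every host in the pool sets. A bit survives only
 * if all hosts agree on it, so a mixed pool loses both HYP_FEAT_EIBRS and
 * HYP_FEAT_NO_EIBRS, and the guest has to assume the worst case.
 */
static uint32_t hyp_pool_common_features(const uint32_t *hosts, size_t n)
{
	uint32_t common = n ? ~0u : 0;
	size_t i;

	assert(hosts != NULL || n == 0);
	for (i = 0; i < n; i++)
		common &= hosts[i];
	return common;
}
```

With all four host types in one pool nothing survives the AND, so the guest
is left with IBRS and the long sequence. A pool of only IceLake and
CascadeLake keeps both eIBRS and the short sequence.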

I would think that the use case for this mixed migration pool is not
workable anymore unless you expose the Family, Model, Stepping so that the
VM can engage the right mitigation. And then when it is live migrated
you re-engage the correct one (so from Skylake to CascadeLake you
turn on eIBRS and BHI; when you move from CascadeLake to Skylake you
turn off BHI and enable retpoline). But nobody has done that work and
nobody will, so why are we debating this?
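
For the sake of argument, the re-engage-on-migration idea would look
roughly like the sketch below. Nothing like this exists today; the model
enum, the mitigation struct and the post-migration hook are all
hypothetical, and the per-model choices simply mirror the two transitions
in the paragraph above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model identifiers derived from Family/Model/Stepping. */
enum hyp_cpu_model { HYP_SKYLAKE, HYP_CASCADELAKE };

enum hyp_spectre_v2 { HYP_V2_RETPOLINE, HYP_V2_EIBRS };

struct hyp_mitigation {
	enum hyp_spectre_v2 v2;
	bool bhi_clearing;	/* BHB-clearing sequence engaged */
};

/* Pick mitigations from the CPU model alone (assumed policy). */
static struct hyp_mitigation hyp_select(enum hyp_cpu_model model)
{
	switch (model) {
	case HYP_CASCADELAKE:
		return (struct hyp_mitigation){ HYP_V2_EIBRS, true };
	case HYP_SKYLAKE:
	default:
		/* Unknown models fall back to the Skylake choice here. */
		return (struct hyp_mitigation){ HYP_V2_RETPOLINE, false };
	}
}

/*
 * Hypothetical hook run in the guest after live migration: re-evaluate
 * the mitigation for the destination host and report whether anything
 * had to change.
 */
static bool hyp_on_migrate(struct hyp_mitigation *cur, enum hyp_cpu_model dest)
{
	struct hyp_mitigation next = hyp_select(dest);
	bool changed;

	assert(cur);
	changed = next.v2 != cur->v2 || next.bhi_clearing != cur->bhi_clearing;
	*cur = next;
	return changed;
}
```

The hard part is not this table but flipping the mitigations at runtime in
a running guest, which is exactly the work nobody has done.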

> 
> Then no matter which system in the pool the guest is migrated to, the guest is
> not vulnerable if it deployed a mitigation for the "worst" case (in general,
> this means a mitigation with larger overhead).
> 
> The good thing is migration in a mixed pool won't compromise the security level
> of guests and guests in a homogeneous pool won't experience any performance loss.

I understand what you are saying, but I can't actually see someone
wanting to do this, as either you get horrible performance (engaging all
the mitigations makes the VM 50% slower), or you switch some of them at
runtime (but nobody has done the work and nobody will, as Thomas will not
want runtime knobs for mitigations).

So how about we just try to solve the problem for the 99% of homogeneous pools?

And not make Skylake guests slower than they already are?
> 
> >> 
> >> >
> >> >Paolo
> >> >
> >> >