Re: [PATCH] KVM: x86: Set BHI_NO in guest when host is not affected by BHI

Alexandre Chartre <alexandre.chartre@xxxxxxxxxx> · Thu, 11 Apr 2024 17:12:56 +0200

On 4/11/24 16:46, Paolo Bonzini wrote:
On Thu, Apr 11, 2024 at 4:34 PM Alexandre Chartre
<alexandre.chartre@xxxxxxxxxx> wrote:
Still, we could enumerate CPUs which don't have eIBRS independently of NO_BHI
(if we have a list of such CPUs) and set X86_BUG_BHI for cpus with eIBRS.

So in arch/x86/kernel/cpu/common.c, replace:

         /* When virtualized, eIBRS could be hidden, assume vulnerable */
         if (!(ia32_cap & ARCH_CAP_BHI_NO) &&
             !cpu_matches(cpu_vuln_whitelist, NO_BHI) &&
             (boot_cpu_has(X86_FEATURE_IBRS_ENHANCED) ||
              boot_cpu_has(X86_FEATURE_HYPERVISOR)))
                 setup_force_cpu_bug(X86_BUG_BHI);

with something like:

         if (!(ia32_cap & ARCH_CAP_BHI_NO) &&
             !cpu_matches(cpu_vuln_whitelist, NO_BHI) &&
             (boot_cpu_has(X86_FEATURE_IBRS_ENHANCED) ||
             !cpu_matches(cpu_vuln_whitelist, NO_EIBRS)))
                 setup_force_cpu_bug(X86_BUG_BHI);

No, that you cannot do because the hypervisor can and will fake the
family/model/stepping.

However, looking again at the original patch you submitted, I think
the review was confusing host and guest sides. If the host is "not
affected", i.e. the host *genuinely* does not have eIBRS, it can set
BHI_NO and that will bypass the "always mitigate in a guest" bit.

I think that's robust and boot_cpu_has_bug(X86_BUG_BHI) is the right
way to do it.

If a VM migration pool has both !eIBRS and eIBRS machines, it will
combine the two; on one hand it will not set the eIBRS bit (bit 1), on
the other hand it will not set BHI_NO either, and it will set the
hypervisor bit. The result is that the guest *will* use mitigations.

To double check, from the point of view of a nested hypervisor, you
could set BHI_NO in a nested guest:
* if the nested hypervisor has BHI_NO passed from the outer level
* or if its CPUID passes cpu_matches(cpu_vuln_whitelist, NO_BHI)
* but it won't matter whether the nested hypervisor lacks eIBRS,
because that bit is not reliable in a VM

The logic you'd use in KVM therefore is:

    (ia32_cap & ARCH_CAP_BHI_NO) ||
    cpu_matches(cpu_vuln_whitelist, NO_BHI) ||
    (!boot_cpu_has(X86_FEATURE_IBRS_ENHANCED) &&
     !boot_cpu_has(X86_FEATURE_HYPERVISOR)))

but that is exactly !boot_cpu_has_bug(X86_BUG_BHI) and is therefore
what Alexandre's patch does.

So I'll wait for further comments but I think the patch is correct.

I think that Andrew's concern is that if there is no eIBRS on the host then
we do not set X86_BUG_BHI on the host because we know the kernel which is
running and this kernel has some mitigations (other than the explicit BHI
mitigations) and these mitigations are enough to prevent BHI. But still
the cpu is affected by BHI.

If you set ARCH_CAP_BHI_NO for the guest then you tell the guest that the
cpu is not affected by BHI although it is. The guest can be running a
different kernel or OS which doesn't necessarily have the (basic) mitigations
used in the host kernel that mitigate BHI.

alex.