On hardware that supports BHI_DIS_S/X86_FEATURE_BHI_CTRL, do not use
hardware mitigation when using BHI_MITIGATION_VMEXIT_ONLY, as this
causes the value of MSR_IA32_SPEC_CTRL to change, which inflicts
additional KVM overhead.

Example: In a typical eIBRS enabled system, such as Intel SPR, the
SPEC_CTRL may be commonly set to val == 1 to reflect eIBRS enablement;
however, SPEC_CTRL_BHI_DIS_S causes val == 1025. If the guests that
KVM is virtualizing do not also set the guest side value == 1025,
KVM will constantly have to wrmsr toggle the guest vs host value on
both entry and exit, delaying both.

Signed-off-by: Jon Kohler <jon@xxxxxxxxxxx>
---
 arch/x86/kernel/cpu/bugs.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 45675da354f3..df7535f5e882 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -1662,8 +1662,16 @@ static void __init bhi_select_mitigation(void)
 		return;
 	}
 
-	/* Mitigate in hardware if supported */
-	if (spec_ctrl_bhi_dis())
+	/*
+	 * Mitigate in hardware if appropriate.
+	 * Note: for vmexit only, do not mitigate in hardware to avoid changing
+	 * the value of MSR_IA32_SPEC_CTRL to include SPEC_CTRL_BHI_DIS_S. If a
+	 * guest does not also set their own SPEC_CTRL to include this, KVM has
+	 * to toggle on every vmexit and vmentry if the host value does not
+	 * match the guest value. Instead, depend on software loop mitigation
+	 * only.
+	 */
+	if (bhi_mitigation != BHI_MITIGATION_VMEXIT_ONLY && spec_ctrl_bhi_dis())
 		return;
 
 	if (!IS_ENABLED(CONFIG_X86_64))
-- 
2.43.0
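
For reference, the 1 vs 1025 arithmetic from the commit message can be
sketched in standalone C. This is only an illustration, not kernel code:
the bit positions are assumed to match the upstream msr-index.h
definitions (IBRS at bit 0, BHI_DIS_S at bit 10), and
spec_ctrl_wrmsrs_per_round_trip() is a hypothetical helper that models
the entry/exit toggling described above rather than anything KVM
actually calls.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror the upstream msr-index.h bit positions. */
#define SPEC_CTRL_IBRS		(1ULL << 0)	/* eIBRS enable */
#define SPEC_CTRL_BHI_DIS_S	(1ULL << 10)	/* BHI_DIS_S enable */

/* The two host values quoted in the commit message. */
static_assert(SPEC_CTRL_IBRS == 1, "eIBRS only");
static_assert((SPEC_CTRL_IBRS | SPEC_CTRL_BHI_DIS_S) == 1025,
	      "eIBRS plus BHI_DIS_S");

/*
 * Hypothetical model, not KVM code: number of SPEC_CTRL writes needed
 * for one vmentry/vmexit round trip. If host and guest values differ,
 * one write loads the guest value on entry and one restores the host
 * value on exit; if they match, no write is needed.
 */
static unsigned int spec_ctrl_wrmsrs_per_round_trip(uint64_t host,
						     uint64_t guest)
{
	return host != guest ? 2 : 0;
}
```

Under this model, a host at 1025 running a guest that keeps SPEC_CTRL
at 1 pays for two MSR writes per round trip, while a host left at 1
pays for none, which is the overhead the vmexit-only case tries to
avoid.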