Re: [PATCH v3 10/27] KVM: VMX: Set FRED MSR interception

Chao Gao <chao.gao@xxxxxxxxx> · Wed, 13 Nov 2024 19:31:03 +0800

On Mon, Sep 30, 2024 at 10:00:53PM -0700, Xin Li (Intel) wrote:
>From: Xin Li <xin3.li@xxxxxxxxx>
>
>Add FRED MSRs to the VMX passthrough MSR list and set FRED MSRs
>interception.
>
>8 FRED MSRs, i.e., MSR_IA32_FRED_RSP[123], MSR_IA32_FRED_STKLVLS,
>MSR_IA32_FRED_SSP[123] and MSR_IA32_FRED_CONFIG, are all safe to be
>passthrough, because they all have a pair of corresponding host and
>guest VMCS fields.
>
>Both MSR_IA32_FRED_RSP0 and MSR_IA32_FRED_SSP0 are dedicated for user
>level event delivery only, IOW they are NOT used in any kernel event
>delivery and the execution of ERETS.  Thus KVM can run safely with
>guest values in the 2 MSRs.  As a result, save and restore of their
>guest values are postponed until vCPU context switching and their host
>values are restored on returning to userspace.
>
>Save/restore of MSR_IA32_FRED_RSP0 is done in the next patch.
>
>Note, as MSR_IA32_FRED_SSP0 is an alias of MSR_IA32_PL0_SSP, its save
>and restore is done through the CET supervisor context management.

But CET may be not supported by either the host or the guest. How will
MSR_IA32_FRED_SSP0 be switched in this case? I think that's part of the reason
why Sean suggested [*] intercepting the MSR when CET isn't exposed to the
guest.

[*]: https://lore.kernel.org/kvm/ZvQaNRhrsSJTYji3@xxxxxxxxxx/#t

>
>Signed-off-by: Xin Li <xin3.li@xxxxxxxxx>
>Signed-off-by: Xin Li (Intel) <xin@xxxxxxxxx>
>Tested-by: Shan Kang <shan.kang@xxxxxxxxx>
>---
> arch/x86/kvm/vmx/vmx.c | 34 ++++++++++++++++++++++++++++++++++
> 1 file changed, 34 insertions(+)
>
>diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>index 28cf89c97bda..c10c955722a3 100644
>--- a/arch/x86/kvm/vmx/vmx.c
>+++ b/arch/x86/kvm/vmx/vmx.c
>@@ -176,6 +176,16 @@ static u32 vmx_possible_passthrough_msrs[] = {
> 	MSR_FS_BASE,
> 	MSR_GS_BASE,
> 	MSR_KERNEL_GS_BASE,
>+	MSR_IA32_FRED_RSP0,
>+	MSR_IA32_FRED_RSP1,
>+	MSR_IA32_FRED_RSP2,
>+	MSR_IA32_FRED_RSP3,
>+	MSR_IA32_FRED_STKLVLS,
>+	MSR_IA32_FRED_SSP1,
>+	MSR_IA32_FRED_SSP2,
>+	MSR_IA32_FRED_SSP3,
>+	MSR_IA32_FRED_CONFIG,
>+	MSR_IA32_FRED_SSP0,		/* Should be added through CET */
> 	MSR_IA32_XFD,
> 	MSR_IA32_XFD_ERR,
> #endif
>@@ -7880,6 +7890,28 @@ static void update_intel_pt_cfg(struct kvm_vcpu *vcpu)
> 		vmx->pt_desc.ctl_bitmask &= ~(0xfULL << (32 + i * 4));
> }
> 
>+static void vmx_set_intercept_for_fred_msr(struct kvm_vcpu *vcpu)
>+{
>+	bool flag = !guest_can_use(vcpu, X86_FEATURE_FRED);
>+
>+	vmx_set_intercept_for_msr(vcpu, MSR_IA32_FRED_RSP0, MSR_TYPE_RW, flag);
>+	vmx_set_intercept_for_msr(vcpu, MSR_IA32_FRED_RSP1, MSR_TYPE_RW, flag);
>+	vmx_set_intercept_for_msr(vcpu, MSR_IA32_FRED_RSP2, MSR_TYPE_RW, flag);
>+	vmx_set_intercept_for_msr(vcpu, MSR_IA32_FRED_RSP3, MSR_TYPE_RW, flag);
>+	vmx_set_intercept_for_msr(vcpu, MSR_IA32_FRED_STKLVLS, MSR_TYPE_RW, flag);
>+	vmx_set_intercept_for_msr(vcpu, MSR_IA32_FRED_SSP1, MSR_TYPE_RW, flag);
>+	vmx_set_intercept_for_msr(vcpu, MSR_IA32_FRED_SSP2, MSR_TYPE_RW, flag);
>+	vmx_set_intercept_for_msr(vcpu, MSR_IA32_FRED_SSP3, MSR_TYPE_RW, flag);
>+	vmx_set_intercept_for_msr(vcpu, MSR_IA32_FRED_CONFIG, MSR_TYPE_RW, flag);
>+
>+	/*
>+	 * flag = !(CET.SUPERVISOR_SHADOW_STACK || FRED)
>+	 *
>+	 * A possible optimization is to intercept SSPs when FRED && !CET.SUPERVISOR_SHADOW_STACK.
>+	 */
>+	vmx_set_intercept_for_msr(vcpu, MSR_IA32_FRED_SSP0, MSR_TYPE_RW, flag);

To implement the "optimization", you can simply remove this line. Then the CET
series will take care of the interception of this MSR. And please leave a
comment here to explain why this MSR is treated differently from other FRED
MSRs.