On Wed, Feb 07, 2024, Xin Li wrote:
> Switch MSR_IA32_FRED_RSP0 between host and guest in
> vmx_prepare_switch_to_{host,guest}().
>
> MSR_IA32_FRED_RSP0 is used during ring 3 event delivery only, thus
> KVM, running on ring 0, can run safely with guest FRED RSP0, i.e.,
> no need to switch between host/guest FRED RSP0 during VM entry and
> exit.
>
> KVM should switch to host FRED RSP0 before returning to user level,
> and switch to guest FRED RSP0 before entering guest mode.

Heh, if only KVM had a framework that was specifically designed for context
switching MSRs on return to userspace.  Translation: please use the
user_return_msr() APIs.

> Signed-off-by: Xin Li <xin3.li@xxxxxxxxx>
> Tested-by: Shan Kang <shan.kang@xxxxxxxxx>
> ---
>
> Changes since v1:
> * Don't use guest_cpuid_has() in vmx_prepare_switch_to_{host,guest}(),
>   which are called from IRQ-disabled context (Chao Gao).
> * Reset msr_guest_fred_rsp0 in __vmx_vcpu_reset() (Chao Gao).
> ---
>  arch/x86/kvm/vmx/vmx.c | 17 +++++++++++++++++
>  arch/x86/kvm/vmx/vmx.h |  2 ++
>  2 files changed, 19 insertions(+)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index b7b772183ee4..264378c3b784 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1337,6 +1337,16 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
>  	}
>
>  	wrmsrl(MSR_KERNEL_GS_BASE, vmx->msr_guest_kernel_gs_base);
> +
> +	if (guest_can_use(vcpu, X86_FEATURE_FRED)) {
> +		/*
> +		 * MSR_IA32_FRED_RSP0 is top of task stack, which never changes.
> +		 * Thus it should be initialized only once.

Then grab the host value during vmx_hardware_setup().  And when you rebase on
top of the latest kvm-x86/next, there's a handy dandy "struct kvm_host_values
kvm_host" to track host MSR values (and similar state).  You could also use
that for MSR_IA32_FRED_CONFIG and MSR_IA32_FRED_STKLVLS.

> +		 */
> +		if (unlikely(vmx->msr_host_fred_rsp0 == 0))
> +			vmx->msr_host_fred_rsp0 = read_msr(MSR_IA32_FRED_RSP0);
> +		wrmsrl(MSR_IA32_FRED_RSP0, vmx->msr_guest_fred_rsp0);
> +	}
>  #else
>  	savesegment(fs, fs_sel);
>  	savesegment(gs, gs_sel);
> @@ -1381,6 +1391,11 @@ static void vmx_prepare_switch_to_host(struct vcpu_vmx *vmx)
>  	invalidate_tss_limit();
>  #ifdef CONFIG_X86_64
>  	wrmsrl(MSR_KERNEL_GS_BASE, vmx->msr_host_kernel_gs_base);
> +
> +	if (guest_can_use(&vmx->vcpu, X86_FEATURE_FRED)) {
> +		vmx->msr_guest_fred_rsp0 = read_msr(MSR_IA32_FRED_RSP0);
> +		wrmsrl(MSR_IA32_FRED_RSP0, vmx->msr_host_fred_rsp0);
> +	}
>  #endif
>  	load_fixmap_gdt(raw_smp_processor_id());
>  	vmx->guest_state_loaded = false;
> @@ -4889,6 +4904,8 @@ static void __vmx_vcpu_reset(struct kvm_vcpu *vcpu)
>
>  #ifdef CONFIG_X86_64
>  	if (kvm_cpu_cap_has(X86_FEATURE_FRED)) {
> +		vmx->msr_guest_fred_rsp0 = 0;

Eh, I wouldn't bother.  Arguably it's better to use __kvm_set_msr(), and "vmx"
is zero-allocated so this is unnecessary.  The GUEST_IA32_FRED_* VMCS fields
need to be explicitly initialized because the VMCS could (very theoretically)
use a non-zero-based encoding scheme.
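
For illustration only, the deferred-restore idea behind a user-return MSR
framework can be modeled in plain userspace C.  This is a rough sketch, not KVM
code: every model_* name, the mock MSR array and the slot numbering are
hypothetical, and nothing here touches real MSRs.  It assumes the host value is
snapshotted once at setup (rather than lazily per vCPU), the guest value is
loaded before entering the guest, and the host value is restored only on the
way back to userspace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_MODEL_MSRS 4	/* hypothetical: size of the mock MSR file */

/* Mock "hardware" MSR file, standing in for rdmsr/wrmsr. */
static uint64_t model_hw_msr[NR_MODEL_MSRS];

struct model_user_return_msr {
	uint64_t host;		/* host value, snapshotted once at setup */
	uint64_t curr;		/* value currently loaded in "hardware" */
	bool registered;
};

static struct model_user_return_msr model_urm[NR_MODEL_MSRS];

/* Setup time: record the host value once. */
static void model_add_user_return_msr(unsigned int slot)
{
	assert(slot < NR_MODEL_MSRS);
	model_urm[slot].host = model_hw_msr[slot];
	model_urm[slot].curr = model_hw_msr[slot];
	model_urm[slot].registered = true;
}

/* Before entering the guest: load the guest value, skip redundant writes. */
static void model_set_user_return_msr(unsigned int slot, uint64_t value)
{
	assert(slot < NR_MODEL_MSRS && model_urm[slot].registered);
	if (model_urm[slot].curr == value)
		return;
	model_hw_msr[slot] = value;
	model_urm[slot].curr = value;
}

/* On return to userspace: restore each MSR that differs from its host value. */
static void model_on_user_return(void)
{
	for (unsigned int i = 0; i < NR_MODEL_MSRS; i++) {
		if (!model_urm[i].registered ||
		    model_urm[i].curr == model_urm[i].host)
			continue;
		model_hw_msr[i] = model_urm[i].host;
		model_urm[i].curr = model_urm[i].host;
	}
}
```

The model tracks the loaded value in software, so it does not cover a guest
that writes the MSR directly; the quoted patch handles that case by reading
the MSR back in vmx_prepare_switch_to_host().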