On Mon, Jul 17, 2023, Weijiang Yang wrote:
>
> On 6/24/2023 4:51 AM, Sean Christopherson wrote:
> > > 1) Add Supervisor Shadow Stack state support (i.e., XSS.bit12 (CET_S)) into
> > > kernel so that host can support guest Supervisor Shadow Stack MSRs in g/h FPU
> > > context switch.
> > If that's necessary for correct functionality, yes.

...

> the Pros:
>  - Super easy to implement for KVM.
>  - Automatically avoids saving and restoring this data when the vmexit
>    is handled within KVM.
>
> the Cons:
>  - Unnecessarily restores XFEATURE_CET_KERNEL when switching to
>    non-KVM task's userspace.
>  - Forces allocating space for this state on all tasks, whether or not
>    they use KVM, and with likely zero users today and in the near future.
>  - Complicates the FPU optimization thinking by including things that
>    can have no effect on userspace in the FPU
>
> Given the above reasons, I implemented guest CET supervisor state management
> in KVM instead of adding a kernel patch for it.
>
> Below are 3 KVM patches to support it:
>
> Patch 1: Save/reload guest CET supervisor states when necessary:
>
> =======================================================================
>
> commit 16147ede75dee29583b7d42a6621d10d55b63595
> Author: Yang Weijiang <weijiang.yang@xxxxxxxxx>
> Date:   Tue Jul 11 02:26:17 2023 -0400
>
>     KVM: x86: Make guest supervisor states as non-XSAVE managed
>
>     Save and reload guest CET supervisor states, i.e., PL{0,1,2}_SSP,
>     when vCPU context is being swapped before and after userspace
>     <-> kernel entry, also do the same operation when vCPU is sched-in
>     or sched-out.

...
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index e2c549f147a5..7d9cfb7e2fe8 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -11212,6 +11212,31 @@ static void kvm_put_guest_fpu(struct kvm_vcpu *vcpu)
>         trace_kvm_fpu(0);
>  }
>
> +static void kvm_save_cet_supervisor_ssp(struct kvm_vcpu *vcpu)
> +{
> +       preempt_disable();
> +       if (unlikely(guest_can_use(vcpu, X86_FEATURE_SHSTK))) {
> +               rdmsrl(MSR_IA32_PL0_SSP, vcpu->arch.cet_s_ssp[0]);
> +               rdmsrl(MSR_IA32_PL1_SSP, vcpu->arch.cet_s_ssp[1]);
> +               rdmsrl(MSR_IA32_PL2_SSP, vcpu->arch.cet_s_ssp[2]);
> +               wrmsrl(MSR_IA32_PL0_SSP, 0);
> +               wrmsrl(MSR_IA32_PL1_SSP, 0);
> +               wrmsrl(MSR_IA32_PL2_SSP, 0);
> +       }
> +       preempt_enable();
> +}
> +
> +static void kvm_reload_cet_supervisor_ssp(struct kvm_vcpu *vcpu)
> +{
> +       preempt_disable();
> +       if (unlikely(guest_can_use(vcpu, X86_FEATURE_SHSTK))) {
> +               wrmsrl(MSR_IA32_PL0_SSP, vcpu->arch.cet_s_ssp[0]);
> +               wrmsrl(MSR_IA32_PL1_SSP, vcpu->arch.cet_s_ssp[1]);
> +               wrmsrl(MSR_IA32_PL2_SSP, vcpu->arch.cet_s_ssp[2]);
> +       }
> +       preempt_enable();
> +}

My understanding is that PL[0-2]_SSP are used only on transitions to the
corresponding privilege level from a *different* privilege level.  That means
KVM should be able to utilize the user_return_msr framework to load the host
values.  Though if Linux ever supports SSS, I'm guessing the core kernel will
have some sort of mechanism to defer loading MSR_IA32_PL0_SSP until an exit to
userspace, e.g. to avoid having to write PL0_SSP, which will presumably be
per-task, on every context switch.

But note my original wording: **if that's necessary**

If nothing in the host ever consumes those MSRs, i.e. if SSS is NOT enabled in
IA32_S_CET, then running host stuff with guest values should be ok.
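For illustration, a rough sketch of what the user_return_msr approach could
look like.  This is NOT a tested implementation; it borrows KVM's existing
kvm_add_user_return_msr()/kvm_set_user_return_msr() helpers from
arch/x86/kvm/x86.c, and the slot variable, init hook, and the assumption that
the host's PL0_SSP value is '0' (no SSS support in the kernel) are all
hypothetical:

```
/*
 * Hypothetical sketch: register PL0_SSP with the user-return framework so
 * that the host value is restored lazily, on the first return to userspace,
 * instead of on every VM-Exit.
 */
static int pl0_ssp_uret_slot;	/* hypothetical slot index */

static void kvm_x86_init_cet_uret_msr(void)
{
	if (kvm_cpu_cap_has(X86_FEATURE_SHSTK))
		pl0_ssp_uret_slot = kvm_add_user_return_msr(MSR_IA32_PL0_SSP);
}

static void kvm_load_guest_pl0_ssp(struct kvm_vcpu *vcpu)
{
	/*
	 * Write the guest value before entering the guest; the framework
	 * snapshots the host value and defers restoring it until the vCPU
	 * task returns to userspace.
	 */
	if (guest_can_use(vcpu, X86_FEATURE_SHSTK))
		kvm_set_user_return_msr(pl0_ssp_uret_slot,
					vcpu->arch.cet_s_ssp[0], -1ull);
}
```

The point of the sketch is the deferral: exits handled entirely within KVM
never pay for restoring the host value.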
KVM only needs to guarantee that it doesn't leak values between guests.  But
that should Just Work, e.g. KVM should load the new vCPU's values if SHSTK is
exposed to the guest, and intercept (to inject #GP) if SHSTK is not exposed to
the guest.

And regardless of which mechanism ends up managing the SSP MSRs, it should
only ever touch PL0_SSP, because Linux never runs anything at CPL1 or CPL2,
i.e. will never consume PL{1,2}_SSP.

Am I missing something?
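The "intercept and inject #GP" part could be a small guard in the MSR
handlers; a hypothetical sketch (the function name is made up, but KVM's
common MSR code does inject #GP when a handler reports failure):

```
/*
 * Hypothetical guard for the PL{0,1,2}_SSP get/set paths: if SHSTK is not
 * exposed to the guest, fail the access so the common MSR code injects #GP.
 * Returns 0 on allowed access, 1 (failure) otherwise.
 */
static int kvm_cet_ssp_msr_allowed(struct kvm_vcpu *vcpu)
{
	if (!guest_can_use(vcpu, X86_FEATURE_SHSTK))
		return 1;
	return 0;
}
```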