On Mon, 2022-02-07 at 15:28 -0800, Dave Hansen wrote: > On 1/30/22 13:18, Rick Edgecombe wrote: > > From: Yu-cheng Yu <yu-cheng.yu@xxxxxxxxx> > > > > Control-flow Enforcement Technology (CET) introduces these MSRs: > > > > MSR_IA32_U_CET (user-mode CET settings), > > MSR_IA32_PL3_SSP (user-mode shadow stack pointer), > > > > MSR_IA32_PL0_SSP (kernel-mode shadow stack pointer), > > MSR_IA32_PL1_SSP (Privilege Level 1 shadow stack pointer), > > MSR_IA32_PL2_SSP (Privilege Level 2 shadow stack pointer), > > MSR_IA32_S_CET (kernel-mode CET settings), > > MSR_IA32_INT_SSP_TAB (exception shadow stack table). > > To be honest, I'm not sure this is very valuable. It's *VERY* close > to > the exact information in the structure definitions. It's also not > obviously related to XSAVE. It's more of the "what" this patch does > than the "why". Good changelogs talk about "why". Ok I'll look at re-wording this. > > > The two user-mode MSRs belong to XFEATURE_CET_USER. The first > > three of > > kernel-mode MSRs belong to XFEATURE_CET_KERNEL. Both XSAVES states > > are > > supervisor states. This means that there is no direct, > > unprivileged access > > to these states, making it harder for an attacker to subvert CET. Oh, well I guess this *is* mentioned elsewhere, than in patch 3. > > Forgive me while I go into changelog lecture mode for a moment. > > I was constantly looking up at the list of MSRs and trying to > reconcile > them with this paragraph. Imagine if you had started out this > changelog > by saying: > > Shadow stack register state can be managed with XSAVE. The > registers can logically be separated into two groups: > > * Registers controlling user-mode operation > * Registers controlling kernel-mode operation > > The architecture has two new XSAVE state components: one for > each group of registers. This _lets_ an OS manage them > separately if it chooses. Linux chooses to ... <explain the > design choice here, or why we don't care yet>. > > Both XSAVE state components are supervisor states, even the > state controlling user-mode operation. This is a departure > from > earlier features like protection keys where the PKRU state is > a normal user (non-supervisor) state. Having the user state be > > supervisor-managed ensures there is no direct, unprivileged > access to it, making it harder for an attacker to subvert CET. > > Also, IBT gunk is in here too, right? Let's at least *mention* that > in > the changelog. We can remove the IBT stuff if its better. I always appreciate finding the unused features in headers when hacking around. But it all adds to build time slightly I guess. > > ... > > /* All supervisor states including supported and unsupported > > states. */ > > #define XFEATURE_MASK_SUPERVISOR_ALL > > (XFEATURE_MASK_SUPERVISOR_SUPPORTED | \ > > diff --git a/arch/x86/include/asm/msr-index.h > > b/arch/x86/include/asm/msr-index.h > > index 3faf0f97edb1..0ee77ce4c753 100644 > > --- a/arch/x86/include/asm/msr-index.h > > +++ b/arch/x86/include/asm/msr-index.h > > @@ -362,6 +362,26 @@ > > > > > > #define MSR_CORE_PERF_LIMIT_REASONS 0x00000690 > > + > > +/* Control-flow Enforcement Technology MSRs */ > > +#define MSR_IA32_U_CET 0x000006a0 /* user mode > > cet setting */ > > +#define MSR_IA32_S_CET 0x000006a2 /* kernel > > mode cet setting */ > > +#define CET_SHSTK_EN BIT_ULL(0) > > +#define CET_WRSS_EN BIT_ULL(1) > > +#define CET_ENDBR_EN BIT_ULL(2) > > +#define CET_LEG_IW_EN BIT_ULL(3) > > +#define CET_NO_TRACK_EN BIT_ULL(4) > > +#define CET_SUPPRESS_DISABLE BIT_ULL(5) > > +#define CET_RESERVED (BIT_ULL(6) | > > BIT_ULL(7) | BIT_ULL(8) | BIT_ULL(9)) > > Would GENMASK_ULL() look any nicer here? I guess it's pretty clear > as-is that bits 6->9 are reserved. Hmm, visually I think it might be easier to catch that you need to remove a reserved bit if it is being added after becoming unreserved some day. > > > +#define CET_SUPPRESS BIT_ULL(10) > > +#define CET_WAIT_ENDBR BIT_ULL(11) > > Are those bit fields common for both registers? It might be worth a > comment to mention that. Yes, I'll mention that. > > > +#define MSR_IA32_PL0_SSP 0x000006a4 /* kernel shadow > > stack pointer */ > > +#define MSR_IA32_PL1_SSP 0x000006a5 /* ring-1 shadow > > stack pointer */ > > +#define MSR_IA32_PL2_SSP 0x000006a6 /* ring-2 shadow > > stack pointer */ > > Are PL1/2 ever used in this implementation? If not, let's axe these > definitions. They are not used. Ok. > > > +#define MSR_IA32_PL3_SSP 0x000006a7 /* user shadow stack > > pointer */ > > +#define MSR_IA32_INT_SSP_TAB 0x000006a8 /* exception > > shadow stack table */ > > + > > #define MSR_GFX_PERF_LIMIT_REASONS 0x000006B0 > > #define MSR_RING_PERF_LIMIT_REASONS 0x000006B1 > > > > diff --git a/arch/x86/kernel/fpu/xstate.c > > b/arch/x86/kernel/fpu/xstate.c > > index 02b3ddaf4f75..44397202762b 100644 > > --- a/arch/x86/kernel/fpu/xstate.c > > +++ b/arch/x86/kernel/fpu/xstate.c > > @@ -50,6 +50,8 @@ static const char *xfeature_names[] = > > "Processor Trace (unused)" , > > "Protection Keys User registers", > > "PASID state", > > + "Control-flow User registers" , > > + "Control-flow Kernel registers" , > > "unknown xstate feature" , > > "unknown xstate feature" , > > "unknown xstate feature" , > > @@ -73,6 +75,8 @@ static unsigned short xsave_cpuid_features[] > > __initdata = { > > [XFEATURE_PT_UNIMPLEMENTED_SO_FAR] = X86_FEATURE_INTEL_PT, > > [XFEATURE_PKRU] = X86_FEATURE_PKU, > > [XFEATURE_PASID] = X86_FEATURE_ENQCMD, > > + [XFEATURE_CET_USER] = X86_FEATURE_SHSTK, > > + [XFEATURE_CET_KERNEL] = > > X86_FEATURE_SHSTK, > > [XFEATURE_XTILE_CFG] = > > X86_FEATURE_AMX_TILE, > > [XFEATURE_XTILE_DATA] = > > X86_FEATURE_AMX_TILE, > > }; > > @@ -250,6 +254,8 @@ static void __init print_xstate_features(void) > > print_xstate_feature(XFEATURE_MASK_Hi16_ZMM); > > print_xstate_feature(XFEATURE_MASK_PKRU); > > print_xstate_feature(XFEATURE_MASK_PASID); > > + print_xstate_feature(XFEATURE_MASK_CET_USER); > > + print_xstate_feature(XFEATURE_MASK_CET_KERNEL); > > print_xstate_feature(XFEATURE_MASK_XTILE_CFG); > > print_xstate_feature(XFEATURE_MASK_XTILE_DATA); > > } > > @@ -405,6 +411,7 @@ static __init void os_xrstor_booting(struct > > xregs_state *xstate) > > XFEATURE_MASK_BNDREGS | \ > > XFEATURE_MASK_BNDCSR | \ > > XFEATURE_MASK_PASID | \ > > + XFEATURE_MASK_CET_USER | \ > > XFEATURE_MASK_XTILE) > > > > /* > > @@ -621,6 +628,8 @@ static bool __init > > check_xstate_against_struct(int nr) > > XCHECK_SZ(sz, nr, XFEATURE_PKRU, struct pkru_state); > > XCHECK_SZ(sz, nr, XFEATURE_PASID, struct ia32_pasid_state); > > XCHECK_SZ(sz, nr, XFEATURE_XTILE_CFG, struct xtile_cfg); > > + XCHECK_SZ(sz, nr, XFEATURE_CET_USER, struct cet_user_state); > > + XCHECK_SZ(sz, nr, XFEATURE_CET_KERNEL, struct > > cet_kernel_state); > > > > /* The tile data size varies between implementations. */ > > if (nr == XFEATURE_XTILE_DATA) > > @@ -634,7 +643,9 @@ static bool __init > > check_xstate_against_struct(int nr) > > if ((nr < XFEATURE_YMM) || > > (nr >= XFEATURE_MAX) || > > (nr == XFEATURE_PT_UNIMPLEMENTED_SO_FAR) || > > - ((nr >= XFEATURE_RSRVD_COMP_11) && (nr <= > > XFEATURE_RSRVD_COMP_16))) { > > + (nr == XFEATURE_RSRVD_COMP_13) || > > + (nr == XFEATURE_RSRVD_COMP_14) || > > + (nr == XFEATURE_RSRVD_COMP_16)) { > > WARN_ONCE(1, "no structure for xstate: %d\n", nr); > > XSTATE_WARN_ON(1); > > return false; > > That if() is getting unweildy. While I generally despise macros > implicitly modifying variables, this might be worth it. We could > have a > local function variable: > > bool feature_checked = false; > > and then muck with it in the macro: > > #define XCHECK_SZ(sz, nr, nr_macro, __struct) do { > if (nr == nr_macro)) { > feature_checked = true; > if (WARN_ONCE(sz != sizeof(__struct), ... ) { > __xstate_dump_leaves(); > } > } > } while (0) > > Then the if() just makes sure the feature was checked instead of > checking for reserved features explicitly. We could also do: > > bool c = false; > > ... > > c |= XCHECK_SZ(sz, nr, XFEATURE_YMM, struct > ymmh_struct); > c |= XCHECK_SZ(sz, nr, XFEATURE_BNDREGS, struct ... > c |= XCHECK_SZ(sz, nr, XFEATURE_BNDCSR, struct ... > ... > > but that starts to run into 80 columns. Those are both nice because > they mean you don't have to maintain a list of reserved features in > the > code. Another option would be to define a: > > bool xfeature_is_reserved(int nr) > { > switch (nr) { > case XFEATURE_RSRVD_COMP_13: > ... > > so the if() looks nicer and won't grow; the function will grow > instead. > > Either way, I think this needs some refactoring. Yes, this makes sense. I'll play around with it.