Re: [PATCH 05/35] x86/fpu/xstate: Introduce CET MSR and XSAVES supervisor states

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 1/30/22 13:18, Rick Edgecombe wrote:
> From: Yu-cheng Yu <yu-cheng.yu@xxxxxxxxx>
> 
> Control-flow Enforcement Technology (CET) introduces these MSRs:
> 
>     MSR_IA32_U_CET (user-mode CET settings),
>     MSR_IA32_PL3_SSP (user-mode shadow stack pointer),
> 
>     MSR_IA32_PL0_SSP (kernel-mode shadow stack pointer),
>     MSR_IA32_PL1_SSP (Privilege Level 1 shadow stack pointer),
>     MSR_IA32_PL2_SSP (Privilege Level 2 shadow stack pointer),
>     MSR_IA32_S_CET (kernel-mode CET settings),
>     MSR_IA32_INT_SSP_TAB (exception shadow stack table).

To be honest, I'm not sure this is very valuable.  It's *VERY* close to
the exact information in the structure definitions.  It's also not
obviously related to XSAVE.  It's more of the "what" this patch does
than the "why".  Good changelogs talk about "why".

> The two user-mode MSRs belong to XFEATURE_CET_USER.  The first three of
> kernel-mode MSRs belong to XFEATURE_CET_KERNEL.  Both XSAVES states are
> supervisor states.  This means that there is no direct, unprivileged access
> to these states, making it harder for an attacker to subvert CET.

Forgive me while I go into changelog lecture mode for a moment.

I was constantly looking up at the list of MSRs and trying to reconcile
them with this paragraph.  Imagine if you had started out this changelog
by saying:

	Shadow stack register state can be managed with XSAVE.  The
	registers can logically be separated into two groups:

		* Registers controlling user-mode operation
		* Registers controlling kernel-mode operation

	The architecture has two new XSAVE state components: one for
	each group of registers.  This _lets_ an OS manage them
	separately if it chooses.  Linux chooses to ... <explain the
	design choice here, or why we don't care yet>.

	Both XSAVE state components are supervisor states, even the
	state controlling user-mode operation.  This is a departure from
	earlier features like protection keys where the PKRU state is
	a normal user (non-supervisor) state.  Having the user state be	
	supervisor-managed ensures there is no direct, unprivileged
	access to it, making it harder for an attacker to subvert CET.

Also, IBT gunk is in here too, right?  Let's at least *mention* that in
the changelog.

...
>  /* All supervisor states including supported and unsupported states. */
>  #define XFEATURE_MASK_SUPERVISOR_ALL (XFEATURE_MASK_SUPERVISOR_SUPPORTED | \
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index 3faf0f97edb1..0ee77ce4c753 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -362,6 +362,26 @@
>  
>  
>  #define MSR_CORE_PERF_LIMIT_REASONS	0x00000690
> +
> +/* Control-flow Enforcement Technology MSRs */
> +#define MSR_IA32_U_CET			0x000006a0 /* user mode cet setting */
> +#define MSR_IA32_S_CET			0x000006a2 /* kernel mode cet setting */
> +#define CET_SHSTK_EN			BIT_ULL(0)
> +#define CET_WRSS_EN			BIT_ULL(1)
> +#define CET_ENDBR_EN			BIT_ULL(2)
> +#define CET_LEG_IW_EN			BIT_ULL(3)
> +#define CET_NO_TRACK_EN			BIT_ULL(4)
> +#define CET_SUPPRESS_DISABLE		BIT_ULL(5)
> +#define CET_RESERVED			(BIT_ULL(6) | BIT_ULL(7) | BIT_ULL(8) | BIT_ULL(9))

Would GENMASK_ULL() look any nicer here?  I guess it's pretty clear
as-is that bits 6->9 are reserved.

> +#define CET_SUPPRESS			BIT_ULL(10)
> +#define CET_WAIT_ENDBR			BIT_ULL(11)

Are those bit fields common for both registers?  It might be worth a
comment to mention that.

> +#define MSR_IA32_PL0_SSP		0x000006a4 /* kernel shadow stack pointer */
> +#define MSR_IA32_PL1_SSP		0x000006a5 /* ring-1 shadow stack pointer */
> +#define MSR_IA32_PL2_SSP		0x000006a6 /* ring-2 shadow stack pointer */

Are PL1/2 ever used in this implementation?  If not, let's axe these
definitions.

> +#define MSR_IA32_PL3_SSP		0x000006a7 /* user shadow stack pointer */
> +#define MSR_IA32_INT_SSP_TAB		0x000006a8 /* exception shadow stack table */
> +
>  #define MSR_GFX_PERF_LIMIT_REASONS	0x000006B0
>  #define MSR_RING_PERF_LIMIT_REASONS	0x000006B1
>  
> diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
> index 02b3ddaf4f75..44397202762b 100644
> --- a/arch/x86/kernel/fpu/xstate.c
> +++ b/arch/x86/kernel/fpu/xstate.c
> @@ -50,6 +50,8 @@ static const char *xfeature_names[] =
>  	"Processor Trace (unused)"	,
>  	"Protection Keys User registers",
>  	"PASID state",
> +	"Control-flow User registers"	,
> +	"Control-flow Kernel registers"	,
>  	"unknown xstate feature"	,
>  	"unknown xstate feature"	,
>  	"unknown xstate feature"	,
> @@ -73,6 +75,8 @@ static unsigned short xsave_cpuid_features[] __initdata = {
>  	[XFEATURE_PT_UNIMPLEMENTED_SO_FAR]	= X86_FEATURE_INTEL_PT,
>  	[XFEATURE_PKRU]				= X86_FEATURE_PKU,
>  	[XFEATURE_PASID]			= X86_FEATURE_ENQCMD,
> +	[XFEATURE_CET_USER]			= X86_FEATURE_SHSTK,
> +	[XFEATURE_CET_KERNEL]			= X86_FEATURE_SHSTK,
>  	[XFEATURE_XTILE_CFG]			= X86_FEATURE_AMX_TILE,
>  	[XFEATURE_XTILE_DATA]			= X86_FEATURE_AMX_TILE,
>  };
> @@ -250,6 +254,8 @@ static void __init print_xstate_features(void)
>  	print_xstate_feature(XFEATURE_MASK_Hi16_ZMM);
>  	print_xstate_feature(XFEATURE_MASK_PKRU);
>  	print_xstate_feature(XFEATURE_MASK_PASID);
> +	print_xstate_feature(XFEATURE_MASK_CET_USER);
> +	print_xstate_feature(XFEATURE_MASK_CET_KERNEL);
>  	print_xstate_feature(XFEATURE_MASK_XTILE_CFG);
>  	print_xstate_feature(XFEATURE_MASK_XTILE_DATA);
>  }
> @@ -405,6 +411,7 @@ static __init void os_xrstor_booting(struct xregs_state *xstate)
>  	 XFEATURE_MASK_BNDREGS |		\
>  	 XFEATURE_MASK_BNDCSR |			\
>  	 XFEATURE_MASK_PASID |			\
> +	 XFEATURE_MASK_CET_USER |		\
>  	 XFEATURE_MASK_XTILE)
>  
>  /*
> @@ -621,6 +628,8 @@ static bool __init check_xstate_against_struct(int nr)
>  	XCHECK_SZ(sz, nr, XFEATURE_PKRU,      struct pkru_state);
>  	XCHECK_SZ(sz, nr, XFEATURE_PASID,     struct ia32_pasid_state);
>  	XCHECK_SZ(sz, nr, XFEATURE_XTILE_CFG, struct xtile_cfg);
> +	XCHECK_SZ(sz, nr, XFEATURE_CET_USER,   struct cet_user_state);
> +	XCHECK_SZ(sz, nr, XFEATURE_CET_KERNEL, struct cet_kernel_state);
>  
>  	/* The tile data size varies between implementations. */
>  	if (nr == XFEATURE_XTILE_DATA)
> @@ -634,7 +643,9 @@ static bool __init check_xstate_against_struct(int nr)
>  	if ((nr < XFEATURE_YMM) ||
>  	    (nr >= XFEATURE_MAX) ||
>  	    (nr == XFEATURE_PT_UNIMPLEMENTED_SO_FAR) ||
> -	    ((nr >= XFEATURE_RSRVD_COMP_11) && (nr <= XFEATURE_RSRVD_COMP_16))) {
> +	    (nr == XFEATURE_RSRVD_COMP_13) ||
> +	    (nr == XFEATURE_RSRVD_COMP_14) ||
> +	    (nr == XFEATURE_RSRVD_COMP_16)) {
>  		WARN_ONCE(1, "no structure for xstate: %d\n", nr);
>  		XSTATE_WARN_ON(1);
>  		return false;

That if() is getting unweildy.  While I generally despise macros
implicitly modifying variables, this might be worth it.  We could have a
local function variable:

	bool feature_checked = false;

and then muck with it in the macro:

#define XCHECK_SZ(sz, nr, nr_macro, __struct) do {
	if (nr == nr_macro)) {
		feature_checked = true;
		if (WARN_ONCE(sz != sizeof(__struct), ... ) {
			__xstate_dump_leaves();
		}
        }
} while (0)

Then the if() just makes sure the feature was checked instead of
checking for reserved features explicitly.  We could also do:

	bool c = false;

	...

        c |= XCHECK_SZ(sz, nr, XFEATURE_YMM,       struct ymmh_struct);
        c |= XCHECK_SZ(sz, nr, XFEATURE_BNDREGS,   struct ...
        c |= XCHECK_SZ(sz, nr, XFEATURE_BNDCSR,    struct ...
	...

but that starts to run into 80 columns.  Those are both nice because
they mean you don't have to maintain a list of reserved features in the
code.  Another option would be to define a:

bool xfeature_is_reserved(int nr)
{
	switch (nr) {
		case XFEATURE_RSRVD_COMP_13:
		...

so the if() looks nicer and won't grow; the function will grow instead.

Either way, I think this needs some refactoring.




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux