Re: [PATCHv6 3/4] x86/tdx: Dynamically disable SEPT violations from causing #VEs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Aug 28, 2024 at 05:27:32PM +0300, Nikolay Borisov wrote:
> 
> 
> On 28.08.24 г. 12:35 ч., Kirill A. Shutemov wrote:
> > Memory access #VEs are hard for Linux to handle in contexts like the
> > entry code or NMIs.  But other OSes need them for functionality.
> > There's a static (pre-guest-boot) way for a VMM to choose one or the
> > other.  But VMMs don't always know which OS they are booting, so they
> > choose to deliver those #VEs so the "other" OSes will work.  That,
> > unfortunately has left us in the lurch and exposed to these
> > hard-to-handle #VEs.
> > 
> > The TDX module has introduced a new feature. Even if the static
> > configuration is set to "send nasty #VEs", the kernel can dynamically
> > request that they be disabled. Once they are disabled, access to private
> > memory that is not in the Mapped state in the Secure-EPT (SEPT) will
> > result in an exit to the VMM rather than injecting a #VE.
> > 
> > Check if the feature is available and disable SEPT #VE if possible.
> > 
> > If the TD is allowed to disable/enable SEPT #VEs, the ATTR_SEPT_VE_DISABLE
> > attribute is no longer reliable. It reflects the initial state of the
> > control for the TD, but it will not be updated if someone (e.g. bootloader)
> > changes it before the kernel starts. Kernel must check TDCS_TD_CTLS bit to
> > determine if SEPT #VEs are enabled or disabled.
> 
> LGTM. However 2 minor suggestions which might be worth addressing.
> 
> Reviewed-by: Nikolay Borisov <nik.borisov@xxxxxxxx>
> 
> > 
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
> > Fixes: 373e715e31bf ("x86/tdx: Panic on bad configs that #VE on "private" memory access")
> > Cc: stable@xxxxxxxxxxxxxxx
> > Acked-by: Kai Huang <kai.huang@xxxxxxxxx>
> > ---
> >   arch/x86/coco/tdx/tdx.c           | 76 ++++++++++++++++++++++++-------
> >   arch/x86/include/asm/shared/tdx.h | 10 +++-
> >   2 files changed, 69 insertions(+), 17 deletions(-)
> > 
> > diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
> > index 08ce488b54d0..f969f4f5ebf8 100644
> > --- a/arch/x86/coco/tdx/tdx.c
> > +++ b/arch/x86/coco/tdx/tdx.c
> > @@ -78,7 +78,7 @@ static inline void tdcall(u64 fn, struct tdx_module_args *args)
> >   }
> >   /* Read TD-scoped metadata */
> > -static inline u64 __maybe_unused tdg_vm_rd(u64 field, u64 *value)
> > +static inline u64 tdg_vm_rd(u64 field, u64 *value)
> >   {
> >   	struct tdx_module_args args = {
> >   		.rdx = field,
> > @@ -193,6 +193,62 @@ static void __noreturn tdx_panic(const char *msg)
> >   		__tdx_hypercall(&args);
> >   }
> > +/*
> > + * The kernel cannot handle #VEs when accessing normal kernel memory. Ensure
> > + * that no #VE will be delivered for accesses to TD-private memory.
> > + *
> > + * TDX 1.0 does not allow the guest to disable SEPT #VE on its own. The VMM
> > + * controls if the guest will receive such #VE with TD attribute
> > + * ATTR_SEPT_VE_DISABLE.
> > + *
> > + * Newer TDX modules allow the guest to control if it wants to receive SEPT
> > + * violation #VEs.
> > + *
> > + * Check if the feature is available and disable SEPT #VE if possible.
> > + *
> > + * If the TD is allowed to disable/enable SEPT #VEs, the ATTR_SEPT_VE_DISABLE
> > + * attribute is no longer reliable. It reflects the initial state of the
> > + * control for the TD, but it will not be updated if someone (e.g. bootloader)
> > + * changes it before the kernel starts. Kernel must check TDCS_TD_CTLS bit to
> > + * determine if SEPT #VEs are enabled or disabled.
> > + */
> > +static void disable_sept_ve(u64 td_attr)
> > +{
> > +	const char *msg = "TD misconfiguration: SEPT #VE has to be disabled";
> > +	bool debug = td_attr & ATTR_DEBUG;
> > +	u64 config, controls;
> > +
> > +	/* Is this TD allowed to disable SEPT #VE */
> > +	tdg_vm_rd(TDCS_CONFIG_FLAGS, &config);
> > +	if (!(config & TDCS_CONFIG_FLEXIBLE_PENDING_VE)) {
> 
> Should you check for the presence of those controls in in
> TDX_FEATURES0.PENDING_EPT_VIOLATION_V2 ? I.e perhaps this code can be put in
> the same function that checks the presence of RBP_NO_MOD in a different
> series by Kai Huang?

No. TDX_FEATURES0 check is not required. This bit in TDCS_CONFIG_FLAGS
cannot be anything else than FLEXIBLE_PENDING_VE and checking only this
bit is enough.

> 
> 
> > +		/* No SEPT #VE controls for the guest: check the attribute */
> > +		if (td_attr & ATTR_SEPT_VE_DISABLE)
> > +			return;
> 
> nit: Given that we expect most guests to actually have this attribute set
> perhaps moving this check at the top of the function will cause it exit
> early more often than not?

The attribute is not reliable source if flexible VE controls are present
as I mentioned in the commit message. We can only rely on it if there's no
TDCS_CONFIG_FLEXIBLE_PENDING_VE.

> > +
> > +		/* Relax SEPT_VE_DISABLE check for debug TD for backtraces */
> > +		if (debug)
> > +			pr_warn("%s\n", msg);
> > +		else
> > +			tdx_panic(msg);
> > +		return;
> > +	}
> > +
> > +	/* Check if SEPT #VE has been disabled before us */
> > +	tdg_vm_rd(TDCS_TD_CTLS, &controls);
> > +	if (controls & TD_CTLS_PENDING_VE_DISABLE)
> > +		return;
> > +
> > +	/* Keep #VEs enabled for splats in debugging environments */
> > +	if (debug)
> > +		return;
> > +
> > +	/* Disable SEPT #VEs */
> > +	tdg_vm_wr(TDCS_TD_CTLS, TD_CTLS_PENDING_VE_DISABLE,
> > +		  TD_CTLS_PENDING_VE_DISABLE);
> > +
> > +	return;
> > +}
> > +
> >   static void tdx_setup(u64 *cc_mask)
> >   {
> >   	struct tdx_module_args args = {};
> 
> <snip>

-- 
  Kiryl Shutsemau / Kirill A. Shutemov




[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux