Re: [PATCH v5 03/16] x86/tdx: Exclude Shared bit from physical_mask

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Nov 05, 2021 at 10:11:48PM +0000, Sean Christopherson wrote:
> On Fri, Oct 08, 2021, Kuppuswamy Sathyanarayanan wrote:
> > From: "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx>
> > 
> > Just like MKTME, TDX reassigns bits of the physical address for
> > metadata.  MKTME used several bits for an encryption KeyID. TDX
> > uses a single bit in guests to communicate whether a physical page
> > should be protected by TDX as private memory (bit set to 0) or
> > unprotected and shared with the VMM (bit set to 1).
> > 
> > Add a helper, tdx_shared_mask() to generate the mask.  The processor
> > enumerates its physical address width to include the shared bit, which
> > means it gets included in __PHYSICAL_MASK by default.
> 
> This is incorrect.  The shared bit _may_ be a legal PA bit, but AIUI it's not a
> hard requirement.

Good point, will fix.

> > Remove the shared mask from 'physical_mask' since any bits in
> > tdx_shared_mask() are not used for physical addresses in page table
> > entries.
> 
> ...
> 
> > @@ -94,6 +100,9 @@ static void tdx_get_info(void)
> >  
> >  	td_info.gpa_width = out.rcx & GENMASK(5, 0);
> >  	td_info.attributes = out.rdx;
> > +
> > +	/* Exclude Shared bit from the __PHYSICAL_MASK */
> > +	physical_mask &= ~tdx_shared_mask();
> 
> This is insufficient, though it's not really the fault of this patch, the specs
> themselves botch this whole thing.
> 
> The TDX Module spec explicitly states that GPAs above GPAW are considered reserved.
> 
>     10.11.1. GPAW-Relate EPT Violations
>     GPA bits higher than the SHARED bit are considered reserved and must be 0.
>     Address translation with any of the reserved bits set to 1 cause a #PF with
>     PFEC (Page Fault Error Code) RSVD bit set.
> 
> But this is contradicted by the architectural extensions spec, which states that
> a GPA that satisfies MAXPA >= GPA > GPAW "can" cause an EPT violation, not #PF.
> Note, this section also appears to have a bug, as it states that GPA bit 47 is
> both the SHARED bit and reserved.  I assume that blurb is intended to clarify
> that bit 47 _would_ be reserved if it weren't the SHARED bit, but because it's
> the shared bit it's ok to access.
> 
>     1.4.2
>     Guest Physical Address Translation
>     If the CPU's maximum physical-address width (MAXPA) is 52 and the guest physical
>     address width is configured to be 48, accesses with GPA bits 51:48 not all being
>     0 can cause an EPT-violation, where such EPT-violations are not mutated to #VE,
>     even if the “EPT-violations #VE” execution control is 1.
> 
>     If the CPU's physical-address width (MAXPA) is less than 48 and the SHARED bit
>     is configured to be in bit position 47, GPA bit 47 would be reserved, and GPA
>        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                    
>     bits 46:MAXPA would be reserved. On such CPUs, setting bits 51:48 or bits
>     46:MAXPA in any paging structure can cause a reserved bit page fault on access.
> 
> The Module spec also calls out that the effective GPA is not to be confused with
> MAXPA, which combined with the above blurb about MAXPA < GPAW, suggests that MAXPA
> is enumerated separately by design so that the guest doesn't incorrectly think
> 46:MAXPA are usable.  But that is problematic for the case where MAXPA > GPAW.
> 
>     The effective GPA width (in bits) for this TD (do not confuse with MAXPA).
>     SHARED bit is at GPA bit GPAW-1.
> 
> I can't find the exact reference, but the TDX module always passes through host's
> MAXPHYADDR.  As it pertains to this patch, just doing
> 
> 	physical_mask &= ~tdx_shared_mask()
> 
> means that a guest running with GPAW=0 and MAXPHYADDR>48 will have a discontiguous
> physical_mask, and could access "reserved" memory.  If the VMM defines legal memory
> with bits [MAXPHYADDR:48]!=0, explosions may ensue.  That's arguably a VMM bug, but
> given that the VMM is untrusted I think the guest should be paranoid when handling
> the SHARED bit.  I also don't know that the kernel will play nice with a discontiguous
> mask.

I expect it to be buggy.

> Specs aside, unless Intel makes a hardware change to treat GPAW as guest.MAXPHYADDR,
> or the TDX Module emulates on EPT violations to inject #PF(RSVD) when appropriate,
> this mess isn't going to be truly fixed from the guest perspective.
> 
> So, IMO all bits >= GPAW should be cleared, and the kernel should warn and/or
> refuse to boot if the host has defined legal memory in that range.

Right. But only >= GPAW-1 as shared bit is the MSB within GPAW:

	physical_mask &= GENMASK_ULL(td_info.gpa_width - 2, 0);

'2' here smells bad, but well...

Given that physical_mask is now contiguous we can truncate anything from
e820 that cannot be addressed with adjusted __PHYSICAL_MASK:

iff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index bc0657f0deed..16d57a8769e8 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -833,6 +833,9 @@ static unsigned long __init e820_end_pfn(unsigned long limit_pfn, enum e820_type
 	unsigned long last_pfn = 0;
 	unsigned long max_arch_pfn = MAX_ARCH_PFN;

+	if (max_arch_pfn > PHYS_PFN(__PHYSICAL_MASK + 1))
+		max_arch_pfn = PHYS_PFN(__PHYSICAL_MASK + 1);
+
 	for (i = 0; i < e820_table->nr_entries; i++) {
 		struct e820_entry *entry = &e820_table->entries[i];
 		unsigned long start_pfn;

Does it look reasonable?

> FWIW, from a VMM perspective, I'm pretty sure the only sane approach is to force
> GPAW=1, a.k.a. SHARED bit == 51, if host.MAXPHYADDR>=49.  But on the guest side,
> I think we should be paranoid.

-- 
 Kirill A. Shutemov



[Index of Archives]     [Linux Kernel]     [Kernel Newbies]     [x86 Platform Driver]     [Netdev]     [Linux Wireless]     [Netfilter]     [Bugtraq]     [Linux Filesystems]     [Yosemite Discussion]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]

  Powered by Linux