Re: [PATCH v2 5/7] x86/boot/compressed/64: Add CPUID sanity check to 32-bit boot-path

Martin Radev <martin.b.radev@xxxxxxxxx> · Wed, 10 Mar 2021 19:10:49 +0100

On Wed, Mar 10, 2021 at 09:51:48AM -0800, Sean Christopherson wrote:
> On Wed, Mar 10, 2021, Martin Radev wrote:
> > On Wed, Mar 10, 2021 at 08:08:37AM -0800, Sean Christopherson wrote:
> > > On Wed, Mar 10, 2021, Joerg Roedel wrote:
> > > > +	/*
> > > > +	 * Sanity check CPUID results from the Hypervisor. See comment in
> > > > +	 * do_vc_no_ghcb() for more details on why this is necessary.
> > > > +	 */
> > > > +
> > > > +	/* Fail if Hypervisor bit not set in CPUID[1].ECX[31] */
> > > 
> > > This check is flawed, as is the existing check in 64-bit boot.  Or I guess more
> > > accurately, the check in get_sev_encryption_bit() is flawed.  AIUI, SEV-ES
> > > doesn't require the hypervisor to intercept CPUID.  A malicious hypervisor can
> > > temporarily pass-through CPUID to bypass the CPUID[1].ECX[31] check.
> > 
> > If erroneous information is provided, either through interception or without, there's
> > this check which is performed every time a new page table is set in the early linux stages:
> > https://elixir.bootlin.com/linux/v5.12-rc2/source/arch/x86/kernel/sev_verify_cbit.S#L22
> > 
> > This should lead to a halt if corruption is detected, unless I'm overlooking something.
> > Please share more info.
> 
> That check is predicated on sme_me_mask != 0, sme_me_mask is set based on the
> result of get_sev_encryption_bit(), and that returns '0' if CPUID[1].ECX[31] is
> '0'.
> 
> sme_enable() also appears to have the same issue, as CPUID[1].ECX[31]=0 would
> cause it to check for SME instead of SEV, and the hypervisor can simply return
> 0 for a VMGEXIT to get MSR_K8_SYSCFG.
> 
> I've no idea if the guest would actually survive with a bogus sme_me_mask, but
> relying on CPUID[1] to #VC is flawed.
> 
> Since MSR_AMD64_SEV is non-interceptable, that seems like it should be the
> canonical way to detect SEV/SEV-ES.  The only complication seems to be handling
> #GP faults on the RDMSR in early boot.
> 
> > > The hypervisor likely has access to the guest firmware source, so it
> > > wouldn't be difficult for the hypervisor to disable CPUID interception once
> > > it detects that firmware is handing over control to the kernel.
> > > 
> > 
> > You probably don't even need to know the firmware for that. There the option
> > to set CR* changes to cause #AE which probably gives away enough information.

I see what you mean but I never tried out disabling interception for cpuid.
There was the idea of checking for bogus information in the VC handler, but what
you suggested would bypass it, I guess.

If the C-bit is not set and memory gets interpreted as unencrypted, then the HV
can gain code execution easily by means of ROP and then switch to the OVMF page
table to easily do proper payload injection.

If interested, check video at https://fosdem.org/2021/schedule/event/tee_sev_es/
on minute 15.