Re: [PATCH v9 31/43] x86/compressed/64: Add support for SEV-SNP CPUID table in #VC handlers

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Feb 05, 2022 at 11:54:01AM +0100, Borislav Petkov wrote:
> On Fri, Jan 28, 2022 at 11:17:52AM -0600, Brijesh Singh wrote:
> > +/*
> > + * Individual entries of the SEV-SNP CPUID table, as defined by the SEV-SNP
> > + * Firmware ABI, Revision 0.9, Section 7.1, Table 14.
> > + */
> > +struct snp_cpuid_fn {
> > +	u32 eax_in;
> > +	u32 ecx_in;
> > +	u64 xcr0_in;
> > +	u64 xss_in;
> 
> So what's the end result here:
> 
> -+	u64 __unused;
> -+	u64 __unused2;
> ++	u64 xcr0_in;
> ++	u64 xss_in;
> 
> those are not unused fields anymore but xcr0 and xss input values?
> 
> Looking at the FW abi doc, they're only mentioned in "Table 14.
> CPUID_FUNCTION Structure" that they're XCR0 and XSS at the time of the
> CPUID execution.
> 
> But those values are input values to what exactly, guest or firmware?

These are inputs to firmware, either through a MSG_CPUID_REQ guest
message that gets passed along by the HV to firmware for validation,
or, in this case, through SNP_LAUNCH_UPDATE when the HV passes the
initial CPUID table contents to firmware for validation before it
gets mapped into guest memory.

> 
> There's a typo in the FW doc, btw:
> 
> "The guest constructs an MSG_CPUID_REQ message as defined in Table 13.
> This message contains an array of CPUID function structures as defined
> in Table 13."
> 
> That second "Table" is 14 not 13.
> 
> So, if an array CPUID_FUNCTION[] is passed as part of an MSG_CPUID_REQ
> command, then, the two _IN variables contain what the guest received
> from the HV for XCR0 and XSS values. Which means, this is the guest
> asking the FW whether those values the HV gave the guest are kosher.
> 
> Am I close?

Most of that documentation is written in the context of the
MSG_CPUID_REQ guest message interface. In that case, a guest has been
provided a certain sets of CPUID values by KVM CPUID instruction
emulation (or potentially the CPUID table), and wants firmware to
validate whether all/some of them are valid for that particular
system/configuration.

In the case of 0xD subfunction 0x0 and 0x1, the CPUID leaf on real
hardware will change depending on what the guest sets in its actual
XCR0 control register (APM Volume 2 11.5.2) or IA32_XSS_MSR register,
whose combined bits are OR'd to form the XFEATURE_ENABLED_MASK.

CPUID leaf 0xD subfunctions 0x0 and 0x1 advertise the size of the
XSAVE area required for the features currently enabled in
XFEATURE_ENABLED_MASK via the EBX return value from the corresponding
CPUID instruction, so for firmware to validate whether that EBX value
is valid for a particular system/configuration, it needs to know
what the guest's currently XFEATURE_ENABLED_MASK is, so it needs to
know what the guest has set in its XCR0 and IA32_XSS_MSR registers.

That's where the XCR0_IN and XSS_IN inputs come into play: they are
inputs to the firmware so it can do the proper validation based on
the guest's current state / XFEATURE_ENABLED_MASK.

But that guest message interface will likely be deprecated at some
point in favor of the static CPUID table (SNP FW ABI 8.14.2.6),
since all the firmware validation has already been done in advance,
and any additional validation is redundant, so this series relies
purely on the CPUID table.

In the case of the CPUID table, we don't know what the guest's
XFEATURE_ENABLED_MASK is going to be at the time a CPUID instruction
is issued for 0xD:0x0,0xD,0x1, and those values will change as the
guest boots and begins enabling additional XSAVE features. The only
thing that's certain is that there needs to be at least one CPUID
table entry for 0xD:0x0 and 0xD:0x1 corresponding to the initial
XFEATURE_ENABLED_MASK, which will correspond to XCR0_IN=1/XSS_IN=0,
or XCR0_IN=3/XSS_IN=0 depending on the hypervisor/guest
implementation.

What to do about other combinations of XCR0_IN/XSS_IN gets a bit
weird, since encoding all those permutations would not scale well,
and alternative methods of trying to encode them in the table would
almost certainly result in lots of different incompatibile guest
implementations floating around.

So our recommendation has been for hypervisors to encode only the
0xD:0x0/0xD:0x1 entries corresponding to these initial guest states
in the CPUID table, and for guests to ignore the XCR0_IN/XSS_IN
values completely, and instead have the guest calculate the XSAVE
save area size dynamically (EBX). This has multiple benefits:

1) Guests that implement this approach would be compatible even
   with hypervisors that try to encode additional permutations of
   XCR0_IN/XSS_IN in the table based on their read of the spec
2) It is just as secure, since it requires only that the guest
   trust itself and the APM specification, which is the ultimate
   authority on what firmware would be validationa against
3) The other leaf registers, EAX/ECX/EDX are not dependent on
   XCR0_IN/XSS_IN/XFEATURE_ENABLED_MASK, and so any
   0xD:0x0/0xD:0x1 entry will do as the starting point for the
   calculation.

That's why in v8 these fields were documented as unused1/unused2.
But when you suggested that we should enforce 0 in that case, it
became clear we should adjust our recommendation to go ahead and
check for the specific XCR0_IN={1,3}/XSS_IN=0 values when
searching for the base 0xD:0x0/0xD:0x1 entry to use, because a
hypervisor that chooses to use XCR0_IN/XSS_IN is not out-of-spec,
and should be supported, and there isn't really any read of the
spec from the hypervisor perspective where XCR0_IN=0/XSS_IN=0
would be valid.

So for v9 I went ahead and added the XCR0_IN/XSS_IN checks, and
updated our recommendations (will send an updated version to
SNP mailing list by Monday) to not ignore them in the guest.

I added some comments in snp_cpuid_get_validated_func() and
snp_cpuid_calc_xsave_size() to clarify how XCR0_IN/XSS_IN are
being used there, but let me know if those need to be beefed up
a bit.

-Mike



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux