Re: [PATCH v12 32/46] x86/compressed/64: Add support for SEV-SNP CPUID table in #VC handlers

Michael Roth <michael.roth@xxxxxxx> · Thu, 10 Mar 2022 15:25:04 -0600

On Thu, Mar 10, 2022 at 07:51:17AM -0700, Peter Gonda wrote:
> ()
> 
> On Mon, Mar 7, 2022 at 2:35 PM Brijesh Singh <brijesh.singh@xxxxxxx> wrote:
> > +static int snp_cpuid_postprocess(struct cpuid_leaf *leaf)
> > +{
> > +       struct cpuid_leaf leaf_hv = *leaf;
> > +
> > +       switch (leaf->fn) {
> > +       case 0x1:
> > +               snp_cpuid_hv(&leaf_hv);
> > +
> > +               /* initial APIC ID */
> > +               leaf->ebx = (leaf_hv.ebx & GENMASK(31, 24)) | (leaf->ebx & GENMASK(23, 0));
> > +               /* APIC enabled bit */
> > +               leaf->edx = (leaf_hv.edx & BIT(9)) | (leaf->edx & ~BIT(9));
> > +
> > +               /* OSXSAVE enabled bit */
> > +               if (native_read_cr4() & X86_CR4_OSXSAVE)
> > +                       leaf->ecx |= BIT(27);
> > +               break;
> > +       case 0x7:
> > +               /* OSPKE enabled bit */
> > +               leaf->ecx &= ~BIT(4);
> > +               if (native_read_cr4() & X86_CR4_PKE)
> > +                       leaf->ecx |= BIT(4);
> > +               break;
> > +       case 0xB:
> > +               leaf_hv.subfn = 0;
> > +               snp_cpuid_hv(&leaf_hv);
> > +
> > +               /* extended APIC ID */
> > +               leaf->edx = leaf_hv.edx;
> > +               break;
> > +       case 0xD: {
> > +               bool compacted = false;
> > +               u64 xcr0 = 1, xss = 0;
> > +               u32 xsave_size;
> > +
> > +               if (leaf->subfn != 0 && leaf->subfn != 1)
> > +                       return 0;
> > +
> > +               if (native_read_cr4() & X86_CR4_OSXSAVE)
> > +                       xcr0 = xgetbv(XCR_XFEATURE_ENABLED_MASK);
> > +               if (leaf->subfn == 1) {
> > +                       /* Get XSS value if XSAVES is enabled. */
> > +                       if (leaf->eax & BIT(3)) {
> > +                               unsigned long lo, hi;
> > +
> > +                               asm volatile("rdmsr" : "=a" (lo), "=d" (hi)
> > +                                                    : "c" (MSR_IA32_XSS));
> > +                               xss = (hi << 32) | lo;
> > +                       }
> > +
> > +                       /*
> > +                        * The PPR and APM aren't clear on what size should be
> > +                        * encoded in 0xD:0x1:EBX when compaction is not enabled
> > +                        * by either XSAVEC (feature bit 1) or XSAVES (feature
> > +                        * bit 3) since SNP-capable hardware has these feature
> > +                        * bits fixed as 1. KVM sets it to 0 in this case, but
> > +                        * to avoid this becoming an issue it's safer to simply
> > +                        * treat this as unsupported for SNP guests.
> > +                        */
> > +                       if (!(leaf->eax & (BIT(1) | BIT(3))))
> > +                               return -EINVAL;
> 
> I couldn't get this patch set to boot and I found that I was setting
> these XSAVE cpuid bits wrong. This took me a while to debug because
> inside of handle_vc_boot_ghcb() this -EINVAL means we jump into the
> halt loop, in addition the early_printk()s inside of that function
> don't seem to  be working for me but should the halt in
> handle_vc_boot_ghcb() be replaced with an sev_es_terminate() or
> something?

For consistency, the error is propagated up the stack the same way as with
all other individual handlers, and it's up to the current #VC handler
function how it wants to handle errors. The other #VC handlers terminate,
but this one has used a halt loop since its initial implementation in 2020
(1aa9aa8ee517e).

Joerg, do you have more background on that? Would it make sense, outside
of this series, to change it to a terminate? Maybe with a specific set
of error codes for ES_{OK,UNSUPPORTED,VMM_ERROR,DECODE_FAILED}?

> 
> I am still working on why the early_printk()s in that function are not
> working, it seems that they lead to a different halt.

I don't see a different halt. They just don't seem to print anything.
(keep in mind you still need to advance the IP or else the guest is
still gonna end up spinning here, even if you're removing the halt loop
for testing purposes)

> working, it seems that they lead to a different halt. Have you tested
> any of those error paths manually? For example if you set your CPUID
> bits to explicitly fail here do you see the expected printks?

I think at that point in the code, when the XSAVE stuff is setup, the
console hasn't been enabled yet, so messages would get buffered until they
get flushed later (which won't happen since there's halt loop after). I
know in some cases devs will dump the log buffer from memory instead to get
at the error messages for early failures. (Maybe that's also why Joerg
decided to use a halt loop there instead of terminating?)

That said, I did some testing to confirm with earlyprintk=serial|vga and
I don't see the error messages even if I modify the #VC handler to allow
booting to continue. pr_err() messages however do show up if I drop the
halt loop. So maybe pr_err() is more appropriate here? But it doesn't
really matter unless you plan on digging into guest memory for the logs.

So maybe reworking the error handling in handle_vc_boot_ghcb() to use
sev_es_terminate() might be warranted, but probably worth checking with
Joerg first, and should be done as a separate series since it is not
SNP-related.