On Mon, Oct 18, 2021 at 09:18:13PM +0200, Borislav Petkov wrote: > On Mon, Oct 18, 2021 at 01:40:03PM -0500, Michael Roth wrote: > > If CPUID has lied, that would result in a #GP, rather than a controlled > > termination in the various checkers/callers. The latter is easier to > > debug. > > > > Additionally, #VC is arguably a better indicator of SEV MSR availability > > for SEV-ES/SEV-SNP guests, since it is only generated by ES/SNP hardware > > and doesn't rely directly on hypervisor/EFI-provided CPUID values. It > > doesn't work for SEV guests, but I don't think it's a bad idea to allow > > SEV-ES/SEV-SNP guests to initialize sev_status in #VC handler to make > > use of the added assurance. > [Sorry for the wall of text, just trying to work through everything.] > Ok, let's take a step back and analyze what we're trying to solve first. > So I'm looking at sme_enable(): I'm not sure if this is pertaining to using the CPUID table prior to sme_enable(), or just the #VC-based SEV MSR read. The following comments assume the former. If that assumption is wrong you can basically ignore the rest of this email :) [The #VC-based SEV MSR read is not necessary for anything in sme_enable(), it's simply a way to determine whether the guest is an SNP guest, without any reliance on CPUID, which seemed useful in the context of doing some additional sanity checks against the SNP CPUID table and determining that it's appropriate to use it early on (rather than just trust that this is an SNP guest by virtue of the CC blob being present, and then failing later once sme_enable() checks for the SNP feature bits through the normal mechanism, as was done in v5).] > > 1. Code checks SME/SEV support leaf. HV lies and says there's none. So > guest doesn't boot encrypted. Oh well, not a big deal, the cloud vendor > won't be able to give confidentiality to its users => users go away or > do unencrypted like now. > > Problem is solved by political and economical pressure. > > 2. Check SEV and SME bit. HV lies here. Oh well, same as the above. I'd be worried about the possibility that, through some additional exploits or failures in the attestation flow, a guest owner was tricked into booting unencrypted on a compromised host and exposing their secrets. Their attestation process might even do some additional CPUID sanity checks, which would at the point be via the SNP CPUID table and look legitimate, unaware that the kernel didn't actually use the SNP CPUID table until after 0x8000001F was parsed (if we were to only initialize it after/as-part-of sme_enable()). Fortunately in this scenario I think the guest kernel actually would fail to boot due to the SNP hardware unconditionally treating code/page tables as encrypted pages. I tested some of these scenarios just to check, but not all, and I still don't feel confident enough about it to say that there's not some way to exploit this by someone who is more clever/persistant than me. > > 3. HV lies about 1. and 2. but says that SME/SEV is supported. > > Guest attempts to read the MSR Guest explodes due to the #GP. The same > political/economical pressure thing happens. That's seems likely, but maybe some future hardware bug, or some other exploit, makes it possible to intercept that MSR read? I don't know, but if that particular branch of execution can be made less likely by utilizing SNP CPUID validation I think it makes sense to make use of it. > > If the MSR is really there, we've landed at the place where we read the > SEV MSR. Moment of truth - SEV/SNP guests have a communication protocol > which is independent from the HV and all good. At which point we then switch to using the CPUID table? But at that point all the previous CPUID checks, both SEV-related/non-SEV-related, are now possibly not consistent with what's in the CPUID table. Do we then revalidate? Even a non-malicious hypervisor might provide inconsistent values between the two sources due to bugs, or SNP validation suppressing certain feature bits that hypervisor otherwise exposes, etc. Now all the code after sme_enable() can potentially take unexpected execution paths, where post-sme_enable() code makes assumptions about pre-sme_enable() checks that may no longer hold true. Also, it would be useful from an attestation perspective that the CPUID bits visible to userspace correspond to what the kernel used during boot, which wouldn't necessarily be the case if hypervisor-provided values were used during early boot and potentially put the kernel into some unexpected state that could persist beyond the point of attestation. Code-wise, thanks in large part to your suggestions, it really isn't all that much more complicated to hook in the CPUID table lookup in the #VC handlers (which are already needed anyway for SEV-ES) early on so all these checks are against the same trusted (or more-trusted at least) CPUID source. > > Now, which case am I missing here which justifies the need to do those > acrobatics of causing #VCs just to detect the SEV MSR? There are a few more places where cpuid is utilized prior to sme_enable(): # In boot/compressed paging_prepare(): ecx = cpuid(7, 0) # SNP-verified against host values # check ecx for LA57 # In boot/compressed and kernel proper verify_cpu(): eax, ebx, ecx, edx = cpuid(0, 0) # SNP-verified against host values # check eax for range > 0 # check ebx, ecx, edx for "AuthenticAMD" or "GenuineIntel" if_amd: edx = cpuid(1, 0) # SNP-verified against host values # check edx feature bits against REQUIRED_MASK0 (PAE|FPU|PSE|etc.) eax = cpuid(0x80000001, 0) # SNP-verified against host values # check eax against REQUIRED_MASK1 (LM|3DNOW) edx = cpuid(1, 0) # SNP-verified against host values # check eax against SSE_MASK # if not set, try to force it on via MSR_K7_HWCR if this is an AMD CPU # if forcing fails, report no_longmode available if_intel: # completely different stuff It's possible that various lies about the values checked for in REQUIRED_MASK0/REQUIRED_MASK1, LA57 enablement, etc., can be audited in similar fashion as you've done above to find nothing concerning, but what about 5 years from now? And these are all checks/configuration that can put the kernel in unexpected states that persist beyond the point of attestation, where we really need to care about the possible effects. If SNP CPUID validation isn't utilized until after-the-fact, we'd end up not utilizing it for some of the more 'interesting' CPUID bits. It's also worth noting that TDX guards against most of this through CPUID virtualization, where hardware/microcode provides similar validation for these sorts of CPUID bits in early boot. It's only because the SEV-SNP CPUID 'virtualization' lives in the guest code that we have to deal with the additional complexity of initializing the CPUID table early on. But if both platforms are capable of providing similar assurances then it seems worthwhile to pursue that. > acrobatics of causing #VCs just to detect the SEV MSR? The CPUID calls in snp_cpuid_init() weren't added specifically to induce the #VC-based SEV MSR read, they were added only because I thought the gist of your earlier suggestions were to do more validation against the CPUID table advertised by EFI rather than the v5 approach of deferring them till later after sev_status gets set by sme_enable(), and then failing later if turned out that SNP CPUID feature bit wasn't set by sme_enable(). I thought this was motivated by a desire to be more paranoid about what EFI provides, so once I had the cpuid checks added in snp_cpuid_init() it seemed like a logical step to further sanity-check the SNP CPUID bit using a mechanism that was completely independent of the CPUID table. I'm not dead set on that at all however, it was only added based on me [mis-]interpreting your comments as a desire to be less trusting of EFI as to whether this is an SNP guest or not. But I also don't think it's a good idea to not utilize the CPUID table until after sme_enable(), based on the above reasons. What if we simply add a check in sme_enable() that terminates the guest if the cc_blob/cpuid table is provided, but the CPUID/MSR checks in sme_enable() determine that this isn't an SNP guest? That would be similar to the v5 approach, but in a less roundabout way, and then the cpuid/MSR checks could be dropped from snp_cpuid_init(). If we did decide that it is useful to use #VC-based initialization of sev_status there, it would all be self-contained there (but again, that's a separate thing that I don't have a strong opinion on). Thanks, Mike > > Thx. > > -- > Regards/Gruss, > Boris. > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpeople.kernel.org%2Ftglx%2Fnotes-about-netiquette&data=04%7C01%7Cmichael.roth%40amd.com%7Cddb8c27d71794244176308d9926c094d%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637701815061371715%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=LUArcQqb%2F0WFlworDbZOClXhgYqEGje364fpBycixUg%3D&reserved=0