On Fri, Jul 07, 2023 at 10:57:12AM +0200, Borislav Petkov wrote: > On Fri, Jul 07, 2023 at 10:22:56AM +0200, Joerg Roedel wrote: > > On Fri, Jul 07, 2023 at 12:23:59PM +0800, Baoquan He wrote: > > > I am wondering why we don't detect the cpu type and return early inside > > > sev_enable() if it's Intel cpu. > > > > > > We can't rely on CONFIG_AMD_MEM_ENCRYPT to decide if the code need be > > > executed or not because we usually enable them all in distros. If the SETUP_CC_BLOB config table entry isn't present, then none of that code will run. I think the patch here addresses the main issue: kexec has handed us a SETUP_EFI setup_data entry, with a pointer to a valid config table, but we don't map it before accessing it. I should have done a better job of checking for this, since there has been similar cases exposed by the SEV checks, e.g.: commit b57feed2cc2622ae14b2fa62f19e973e5e0a60cf Author: Michael Roth <michael.roth@xxxxxxx> Date: Tue Jul 5 21:53:15 2022 -0500 x86/compressed/64: Add identity mappings for setup_data entries but I don't think there's anything inherently wrong with accessing the EFI config table in early boot. SEV is earliest user now, but something else can come along. We certainly have early access to EFI system table and other bootparams fields like ACPI RDSP in this neck of the woods. So regardless of how we decide to handle sev_enable(), I think this patch should be applied, rather than allowing this to lurk in the shadows until it claims its next victim. > > > > Looking at the code in head_64.S, by the time sev_enable() runs the SEV > > bit should already be set in sev_status. Maybe use that to detect > > whether SEV is enabled and bail out early? sev_enabled() is actually what sets sev_status global, but for startup32 path, the SEV_ENABLED bit is artificially set earlier to force the C-bit check in startup32_check_sev_cbit(): fef81c8626: x86/boot/compressed/64: Check SEV encryption in the 32-bit boot-path > > There was something about getting the CPUID page on SNP *before* > actually calling CPUID but this is not the first time we had trouble in > this area. This needs to be done differently. > > Michael? The main issue is we don't know when it is safe to access the SEV_STATUS MSR until we've checked for the CPUID bit that advertises SEV capability, otherwise we can generate a #GP on machines that don't support it. With SNP, the most reliable way to access this CPUID bit is via the SNP CPUID table entry corresponding to 0x8000001F, rather than the emulated values provided by hypervisor, like we do with SEV-ES. So that's how things are implemented currently. We *could* instead revert back to using the hypervisor-provided CPUID values for 0x8000001F, as with SEV-ES, which would allow us to check SEV_STATUS MSR *prior* to SNP CPUID table setup, instead of afterward... What's awkward about that approach is that we'd effectively be bypassing all the validation that the SNP firmware does on the 0x8000001F leaf. In general this seems backwards, however: a) we still have the option of re-checking the 0x8000001F leaf once the SNP CPUID table is setup to see if the hypervisor presented something different to the guest than what we were expecting b) in this *specific* case, we only need to check for SEV CPUID bit so we know it is safe to check SEV_STATUS MSR. A malicious hypervisor can: i) trick us into causing a crash via the MSR access: but that's okay, security isn't compromised. hypervisor's fault, not ours. ii) trick us into thinking SEV is enabled so hypervisor can intercept/emulated the MSR access: but that's okay, because when SEV is enabled the SEV_STATUS MSR can't be intercepted, and if SEV isn't actually enabled, the guest wouldn't get past attestation. iii) trick us into thinking SEV isn't enabled, which should be fine if we leave sev_status unset. So, yes, we can avoid checking for CC blob by relying on the above details and reversing the ordering of CPUID table setup and initial SEV_STATUS MSR checks. But we'd want to similarly audit any changes to these paths so that an eventual case doesn't pop up where a malicious hypervisor can cause a certain check/configuration to be bypassed by toying with CPUID values at this stage of boot, which is why we've so far tried to maintain the current handling. It would be unfortunate if we finally abandoned this path because of the issue being hit here though. I think the patch posted here is the proper resolution to the issue being hit, and I'm hoping at this point we've identified all the similar cases where EFI/setup_data-related structures were missing explicit mappings. But if we still think it's too much of a liability to access the EFI config table outside of SEV-enabled guests, then I can work on re-implementing things based on the above logic. -Mike > > -- > Regards/Gruss, > Boris. > > https://people.kernel.org/tglx/notes-about-netiquette