On Thu, Oct 21, 2021 at 04:48:16PM +0200, Borislav Petkov wrote: > On Wed, Oct 20, 2021 at 11:10:23AM -0500, Michael Roth wrote: > > At which point we then switch to using the CPUID table? But at that > > point all the previous CPUID checks, both SEV-related/non-SEV-related, > > are now possibly not consistent with what's in the CPUID table. Do we > > then revalidate? > > Well, that's a tough question. That's basically the same question as, > does Linux support heterogeneous cores and can it handle hardware > features which get enabled after boot. The perfect example is, late > microcode loading which changes CPUID bits and adds new functionality. > > And the answer to that is, well, hard. You need to decide this on a > case-by-case basis. > > But isn't it that the SNP CPUID page will be parsed early enough anyway > so that kernel proper will see only SNP CPUID info and init properly > using that? At the time I wrote that I thought you were suggesting moving the SNP CPUID table initialization to where sme_enable() is in current upstream, so it seemed worth mentioning, but since the idea was actually to move all the sev_status initialization in sme_enable() earlier in the code to where SNP CPUID table init needs to happen (before first cpuid calls are made), I this scenario is avoided. > > > Even a non-malicious hypervisor might provide inconsistent values > > between the two sources due to bugs, or SNP validation suppressing > > certain feature bits that hypervisor otherwise exposes, etc. > > There's also migration, lemme point to a very recent example: > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fr%2F20211021104744.24126-1-jane.malalane%40citrix.com&data=04%7C01%7Cmichael.roth%40amd.com%7C4aaa998fcd134c8d054608d994a1d1aa%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637704245057316093%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=OKO9o3YzKwRkyWPpam2%2Fxn4aRSMKtPEnZjn05g81SP8%3D&reserved=0 > > which is exactly what you say - a non-malicious HV taking care of its > migration pool. So how do you handle that? I concur with David's assessment on that solution being compatible with CPUID enforcement policy. But it's certainly something to consider more generally. Fortunately I think I misspoke earlier, I thought there was a case or 2 where bits were suppressed, rather than causing a validation failure, but looking back through the PPR I doesn't seem like that's actually the case. Which is good, since that would indeed be painful to deal with in the context of migration.