On Tue, Aug 13, 2024 at 06:16:10PM -0700, Sean Christopherson wrote:
>On Wed, Aug 14, 2024, Chao Gao wrote:
>> On Tue, Aug 13, 2024 at 11:14:31PM +0800, Xiaoyao Li wrote:
>> >On 8/13/2024 7:34 PM, Chao Gao wrote:
>> >> I think adding new fixed-1 bits is fine as long as they don't break KVM, i.e.,
>> >> KVM shouldn't need to take any action for the new fixed-1 bits, like
>> >> saving/restoring more host CPU states across TD-enter/exit or emulating
>> >> CPUID/MSR accesses from guests
>> >
>> >I disagree. Adding new fixed-1 bits in a newer TDX module can lead to a
>> >different TD with same cpu model.
>>
>> The new TDX module simply doesn't support old CPU models.
>
>What happens if the new TDX module is needed to fix a security issue? Or if a
>customer wants to support a heterogenous migration pool, and older (physical)
>CPUs don't support the feature? Or if a customer wants to continue hosting
>existing VM shapes on newer hardware?
>
>> QEMU can report an error and define a new CPU model that works with the TDX
>> module. Sometimes, CPUs may drop features;
>
>Very, very rarely. And when it does happen, there are years of warning before
>the features are dropped.
>
>> this may cause KVM to not support some features and in turn some old CPU
>> models having those features cannot be supported. is it a requirement for
>> TDX modules alone that old CPU models must always be supported?
>
>Not a hard requirement, but a pretty firm one. There needs to be sane, reasonable
>behavior, or we're going to have problems.

OK. So, the expectation is the TDX module should avoid adding new fixed-1 bits.
I suppose this also applies to "native" CPUID bits, which are not configurable
and simply reflected as native values to TDs.

One scenario where "fixed-1" bits can help is: we discover a security issue and
release a microcode update to expose a feature indicating which CPUs are
vulnerable.
If the TDX module allows the VMM to configure the feature as 0 (i.e., not
vulnerable) on vulnerable CPUs, a TD might incorrectly assume it's not
vulnerable, creating a security issue.

I think in the above case, the TDX module has to add a "fixed-1" bit. An
example of such a feature is RRSBA in the IA32_ARCH_CAPABILITIES MSR.
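
To make the compatibility concern concrete, here is a rough sketch of the kind
of check a VMM would have to do when a CPU model meets a TDX module's fixed
bits. This is not actual KVM, QEMU or TDX module code: struct td_msr_policy
and td_model_compatible() are made-up names for illustration, and the bit
position used for RRSBA (bit 19 of IA32_ARCH_CAPABILITIES) is my assumption
and should be checked against the SDM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: RRSBA is enumerated by bit 19 of IA32_ARCH_CAPABILITIES. */
#define ARCH_CAP_RRSBA (1ULL << 19)

/* Hypothetical summary of what one TDX module version allows for an MSR. */
struct td_msr_policy {
	uint64_t fixed1;       /* bits the module always reports as 1 */
	uint64_t configurable; /* bits the VMM may choose as 0 or 1 */
	/* every other bit is treated as fixed-0 in this sketch */
};

/*
 * Check whether the value a CPU model asks for can be honoured.
 * On return, *conflict holds the bits that cannot be honoured.
 */
static bool td_model_compatible(const struct td_msr_policy *p,
				uint64_t requested, uint64_t *conflict)
{
	/* Model wants 0, but the module forces the bit to 1. */
	uint64_t forced_on = p->fixed1 & ~requested;
	/* Model wants 1, but the bit is neither fixed-1 nor configurable. */
	uint64_t unsupported = requested & ~(p->fixed1 | p->configurable);

	*conflict = forced_on | unsupported;
	return *conflict == 0;
}
```

With a policy where RRSBA is configurable, a CPU model that leaves RRSBA
clear passes the check. Once a newer module moves RRSBA to fixed-1, the same
model fails with RRSBA reported in *conflict, which is the "different TD with
same cpu model" problem raised above.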