Re: [ANNOUNCE] PUCK Notes - 2024.04.03 - TDX Upstreaming Strategy

"Edgecombe, Rick P" <rick.p.edgecombe@xxxxxxxxx> · Thu, 11 Apr 2024 01:13:43 +0000

On Tue, 2024-04-09 at 09:26 -0700, Sean Christopherson wrote:
> > Haha, if this is the confusion, I see why you reacted that way to "JSON".
> > That would be quite the curious choice for a TDX module API.
> > 
> > So it is easy to convert it to a C struct and embed it in KVM. It's just not
> > that useful because it will not necessarily be valid for future TDX modules.
> 
> No, I don't want to embed anything in KVM, that's the exact same as hardcoding
> crud into KVM, which is what I want to avoid.  I want to be able to roll out a
> new TDX module with any kernel changes, and I want userspace to be able to
> assert that, for a given TDX module, the effective guest CPUID configuration
> aligns with userspace's desired the vCPU model, i.e. that the value of fixed
> bits match up with the guest CPUID that userspace wants to define.
> 
> Maybe that just means converting the JSON file into some binary format that
> the kernel can already parse.  But I want Intel to commit to providing that
> metadata along with every TDX module.

Oof. It turns out that one of the JSON files describes a different interface
(the TDX module runtime interface) that provides a way to read the CPUID data
that is configured in a TD, including fixed bits. It works like:
1. VMM queries which CPUID bits are directly configurable.
2. VMM provides directly configurable CPUID bits, along with XFAM and
ATTRIBUTES, via TDH.MNG.INIT. (KVM_TDX_INIT_VM)
3. Then VMM can use this other interface, via TDH.MNG.RD, to query the resulting
values of specific CPUID leaves.

This does not provide a way to query the fixed bits specifically. It tells you
what ended up getting configured in a specific TD, which includes the fixed
bits and anything else. So we need to do KVM_TDX_INIT_VM before KVM_SET_CPUID in
order to have something to check against. But there was discussion of
KVM_SET_CPUID on CPU0 having the CPUID state to pass to KVM_TDX_INIT_VM. So that
ordering would need to be sorted out.
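
To make the "something to check against" part concrete, here is a rough sketch
in C of comparing what userspace asked for against what was read back from the
TD. Everything in it is made up for illustration: struct cpuid_entry,
find_entry() and cpuid_consistent() are hypothetical names, not KVM or TDX
module definitions, and the sketch assumes the read-back values have already
been collected into a plain table.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for one CPUID leaf/sub-leaf; not a KVM or TDX type. */
struct cpuid_entry {
	uint32_t leaf;
	uint32_t subleaf;
	uint32_t regs[4];	/* EAX, EBX, ECX, EDX */
};

/* Look up (leaf, subleaf) in a table; returns NULL when it is absent. */
static const struct cpuid_entry *find_entry(const struct cpuid_entry *tbl,
					    size_t n, uint32_t leaf,
					    uint32_t subleaf)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].leaf == leaf && tbl[i].subleaf == subleaf)
			return &tbl[i];
	return NULL;
}

/*
 * Return true when every entry userspace requested is present in, and equal
 * to, what the TD was actually configured with. On the first mismatch, *bad
 * (if non-NULL) is pointed at the offending requested entry.
 */
static bool cpuid_consistent(const struct cpuid_entry *want, size_t n_want,
			     const struct cpuid_entry *actual, size_t n_actual,
			     const struct cpuid_entry **bad)
{
	for (size_t i = 0; i < n_want; i++) {
		const struct cpuid_entry *a =
			find_entry(actual, n_actual, want[i].leaf,
				   want[i].subleaf);
		bool ok = a != NULL;

		/* Compare all four registers; stop at the first difference. */
		for (int r = 0; ok && r < 4; r++)
			ok = (want[i].regs[r] == a->regs[r]);

		if (!ok) {
			if (bad)
				*bad = &want[i];
			return false;
		}
	}
	return true;
}
```

A strict equality check like this is the simplest possible policy. Whether the
real check should be that strict, or only cover certain bits, is part of what
still needs to be worked out.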

If we pass the directly configurable values with KVM_TDX_INIT_VM, like we do
today, then the data provided by this interface should allow us to check
consistency between KVM_SET_CPUID and the actual configured TD CPUID behavior.
But we still need to think through what guarantees we need from the TDX module
to prevent TDX module changes from breaking userspace. For example, if something
goes from fixed 1 to fixed 0, the KVM_SET_CPUID call could start getting
rejected where it wasn't before. Some of the TDX docs have a statement on how to
help with this situation:

"
A properly written VMM should be able to handle the fact that more CPUID bits
become configurable. The host VMM should always consult the list of directly
configurable CPUID leaves and sub-leaves, as enumerated by TDH.SYS.RD/RDALL
or TDH.SYS.INFO. If a CPUID bit is enumerated as configurable, and the VMM was
not designed to configure that bit, the VMM should set the configuration for
that bit to 1. If the host VMM neglects to configure CPUID bits that are
configurable, their virtual value (as seen by guest TDs) will be 0.
"
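
Read literally, that guidance boils down to a bit of mask arithmetic per CPUID
register. A minimal sketch in C, assuming the VMM keeps a mask of the bits it
was designed to configure (the function name and all three parameters are
hypothetical, not from KVM or the TDX module ABI):

```c
#include <stdint.h>

/*
 * configurable: bits the TDX module enumerates as directly configurable
 * known:        bits this VMM was designed to configure (assumed input)
 * desired:      values the VMM wants for the bits it knows about
 *
 * Returns the value to pass as the configuration for this register.
 */
static uint32_t cpuid_config_value(uint32_t configurable, uint32_t known,
				   uint32_t desired)
{
	/* Bits the VMM understands and is allowed to set: use its choice. */
	uint32_t chosen = desired & known & configurable;

	/* Configurable bits the VMM was not designed for: set them to 1. */
	uint32_t unknown = configurable & ~known;

	return chosen | unknown;
}
```

For example, with configurable = 0x0F, known = 0x03 and desired = 0x01, the
result is 0x0D: bit 0 comes from the VMM's own choice, and bits 2 and 3 are
newly configurable bits defaulted to 1.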

How this could translate to something sensible for the KVM API we are discussing
is not immediately obvious to me, but I need to think more on it.

We also still need to verify this method of reading CPUID data a bit more. So
this is just a status update to share the direction we are heading.