On 12/7/2024 2:41 AM, Edgecombe, Rick P wrote:
On Fri, 2024-12-06 at 10:42 +0800, Xiaoyao Li wrote:
# Interaction with TDX_FEATURES0.VE_REDUCTION
TDX introduces a new feature, VE_REDUCTION[2]. From the perspective of
the host VMM, VE_REDUCTION turns several CPUID bits from fixed1 to
configurable, e.g., MTRR, MCA, MCE, etc. However, from the perspective
of the TD guest, it’s an opt-in feature. The actual value seen by the TD
guest depends on multiple factors: 1) whether the TD guest enables
REDUCE_VE in TDCS.TD_CTLS, 2) the setting of TDCS.FEATURE_PARAVIRT_CTRL,
and 3) the CPUID value configured by the host VMM via
TD_PARAMS.CPUID_CONFIG[]. (Please refer to the latest TDX 1.5 spec for
more details.)
Since the host VMM has no idea of the settings of 1) and 2) when creating
the TD, the design treats these bits as configurable, and for simplicity
the global metadata interface doesn’t report them as fixed1 bits.
The host VMM must itself be aware that the value of these
VE_REDUCTION-related CPUID bits might not be what it configured. The
actual value seen by the TD guest also depends on whether and how the
guest enables and configures VE_REDUCTION.
As we've been working on this, I've started to wonder whether this is a halfway
solution that is not worth it. Today there are directly configurable bits,
XFAM/attribute-controlled bits, and other opt-ins (like #VE reduction). And this
has only gotten more complicated as time has gone on.
If we really want to fully solve the problem of userspace understanding which
configurations are possible, the TDX module would almost need to expose some
sort of CPUID logic DSL that could be used to evaluate user configuration.
At the other extreme, we could just say that this kind of logic is going to need
to be hand coded somewhere, as is currently done in the QEMU patches.
I think hand-coding some specific handling for special cases is
acceptable when it's unavoidable. However, an auto-adaptive interface for
the general case is always better than hand-coding or hard-coding
something. E.g., the current QEMU implementation hardcodes the fixed0 and
fixed1 information based on the TDX 1.5.06 spec. If different versions of
the TDX module have different fixed0 and fixed1 information, QEMU will
need an interface to get the version of the TDX module and will have to
maintain separate information for each version. It's a disaster IMHO.
The solution in this proposal decreases the work the VMM has to do, but in the
long term won't remove hand coding completely. As long as we are designing
something, what kind of bar should we target?
For this specific #VE reduction case, I think userspace doesn't need to
do any hand coding. Userspace just treats the bits related to #VE
reduction as configurable, as reported by the TDX module/KVM. And
userspace doesn't care whether the value seen by the TD guest matches
what it configured, because that is outside userspace's control.
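
To illustrate the "just treat them as configurable" approach, here is a
minimal sketch of how userspace could check a requested CPUID register
value against reported masks. The struct and function names are
hypothetical and are not the KVM or TDX module ABI. The sketch assumes
that any bit not reported as configurable is fixed, with fixed1 giving
its value, and that all remaining bits are fixed0.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical per-register description, for illustration only.
 * Assumption: bits outside `configurable` are fixed; `fixed1` marks the
 * fixed bits that read as 1, and every other fixed bit reads as 0.
 */
struct cpuid_reg_caps {
    uint32_t fixed1;        /* bits the guest always sees as 1 */
    uint32_t configurable;  /* bits the VMM is allowed to choose */
};

/* Return true if `requested` only deviates from fixed1 on configurable bits. */
static bool cpuid_reg_valid(const struct cpuid_reg_caps *caps,
                            uint32_t requested)
{
    uint32_t fixed_mask = ~caps->configurable;

    return ((requested ^ caps->fixed1) & fixed_mask) == 0;
}

/* Force fixed bits to their required value and keep the configurable ones. */
static uint32_t cpuid_reg_clamp(const struct cpuid_reg_caps *caps,
                                uint32_t requested)
{
    return (requested & caps->configurable) |
           (caps->fixed1 & ~caps->configurable);
}
```

With this shape, a bit such as MTRR that moves from fixed1 to
configurable only changes the reported masks. The checking code stays
the same, and it deliberately says nothing about what the TD guest ends
up seeing after its own #VE reduction opt-in.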