Re: [PATCH 3/3] KVM: VMX: Initialize TDX during KVM module load

Sean Christopherson <seanjc@xxxxxxxxxx> · Thu, 7 Nov 2024 14:04:25 -0800

On Wed, Nov 06, 2024, Kai Huang wrote:
> On Wed, 2024-11-06 at 07:01 -0800, Sean Christopherson wrote:
> > On Wed, Nov 06, 2024, Kai Huang wrote:
> > > For this we only disable TDX but continue to load module.  The reason is I think
> > > this is similar to enable a specific KVM feature but the hardware doesn't
> > > support it.  We can go further to check the return value of tdx_cpu_enable() to
> > > distinguish cases like "module not loaded" and "unexpected error", but I really
> > > don't want to go that far.
> > 
> > Hrm, tdx_cpu_enable() is a bit of a mess.  Ideally, there would be a separate
> > "probe" API so that KVM could detect if TDX is supported.  Though maybe it's the
> > TDX module itself is flawed, e.g. if TDH_SYS_INIT is literally the only way to
> > detect whether or not a module is loaded.
> 
> We can also use P-SEAMLDR SEAMCALL to query, but I see no difference between
> using TDH_SYS_INIT.  If you are asking whether there's CPUID or MSR to query
> then no.

Doesn't have to be a CPUID or MSR, anything idempotent would work.  Which begs
the question, is that P-SEAMLDR SEAMCALL query you have in mind idempotent? :-)

> > So, absent a way to clean up tdx_cpu_enable(), maybe disable the module param if
> > it returns -ENODEV, otherwise fail the module load?
> 
> We can, but we need to assume cpuhp_setup_state_cpuslocked() itself will not
> return -ENODEV (it is true now), otherwise we won't be able to distinguish
> whether the -ENODEV was from cpuhp_setup_state_cpuslocked() or tdx_cpu_enable().
> 
> Unless we choose to do tdx_cpu_enable() via on_each_cpu() separately.
> 
> Btw tdx_cpu_enable() itself will print "module not loaded" in case of -ENODEV,
> so the user will be aware anyway if we only disable TDX but not fail module
> loading.

That only helps if a human looks at dmesg before attempting to run a TDX VM on
the host, and parsing dmesg to treat that particular scenario as fatal isn't
something I want to recommend to end users.  E.g. if our platform configuration
screwed up and failed to load a TDX module, then I want that to be surfaced as
an alarm of sorts, not a silent "this platform doesn't support TDX" flag.

> My concern is still the whole "different handling of error cases" seems over-
> engineering.

IMO, that's a symptom of the TDX enabling code not cleanly separating "probe"
from "enable", and at a glance, that seems very solvable.  And I suspect that
cleaning things up will allow for additional hardening.  E.g. I assume the lack
of MOVDIR64B should be a WARN, but because KVM checks for MOVDIR64B before
checking for basic TDX support, it's an non-commitalpr_warn().

> > > 4) tdx_enable() fails.
> > > 
> > > Ditto to 3).
> > 
> > No, this should fail the module load.  E.g. most of the error conditions are
> > -ENOMEM, which has nothing to do with host support for TDX.
> > 
> > > 5) tdx_get_sysinfo() fails.
> > > 
> > > This is a kernel bug since tdx_get_sysinfo() should always return valid TDX
> > > sysinfo structure pointer after tdx_enable() is done successfully.  Currently we
> > > just WARN() if the returned pointer is NULL and disable TDX only.  I think it's
> > > also fine.
> > > 
> > > 6) TDX global metadata check fails, e.g., MAX_VCPUS etc.
> > > 
> > > Ditto to 3).  For this we disable TDX only.
> > 
> > Where is this code?
> 
> Please check:
> 
> https://github.com/intel/tdx/blob/tdx_kvm_dev-2024-10-25.1-host-metadata-v6-rebase/arch/x86/kvm/vmx/tdx.c
> 
> .. starting at line 3320.

Before I forget, that code has a bug.  This

	/* Check TDX module and KVM capabilities */
	if (!tdx_get_supported_attrs(&tdx_sysinfo->td_conf) ||
	    !tdx_get_supported_xfam(&tdx_sysinfo->td_conf))
		goto get_sysinfo_err;

will return '0' on error, instead of -EINVAL (or whatever it intends).

Back to the main discussion, these checks are obvious "probe" failures and so
should disable TDX without failing module load:

	if (!tdp_mmu_enabled || !enable_mmio_caching)
		return -EOPNOTSUPP;

	if (!cpu_feature_enabled(X86_FEATURE_MOVDIR64B)) {
		pr_warn("MOVDIR64B is reqiured for TDX\n");
		return -EOPNOTSUPP;
	}

A kvm_find_user_return_msr() error is obviously a KVM bug, i.e. should definitely
WARN and fail module module.  Ditto for kvm_enable_virtualization().

The boot_cpu_has(X86_FEATURE_TDX_HOST_PLATFORM) that's buried in tdx_enable()
really belongs in KVM.  Having it in both is totally fine, but KVM shouldn't do
a bunch of work and _then_ check if all that work was pointless.

I am ok treating everything at or after tdx_get_sysinfo() as fatal to module load,
especially since, IIUC, TD_SYS_INIT can't be undone, i.e. KVM has crossed a point
of no return.

In short, assuming KVM can query if a TDX module is a loaded, I don't think it's
all that much work to do:

  static bool kvm_is_tdx_supported(void)
  {
	if (boot_cpu_has(X86_FEATURE_TDX_HOST_PLATFORM))
		return false;

	if (!<is TDX module loaded>)
		return false;

	if (!tdp_mmu_enabled || !enable_mmio_caching)
		return false;

	if (WARN_ON_ONCE(!cpu_feature_enabled(X86_FEATURE_MOVDIR64B)))
		return false;

	return true;
  }

  int __init tdx_bringup(void)
  {
	enable_tdx = enable_tdx && kvm_is_tdx_supported();
	if (!enable_tdx)
		return 0;

	return __tdx_bringup();
  }