On Wed, Nov 06, 2024, Kai Huang wrote: > On Wed, 2024-11-06 at 07:01 -0800, Sean Christopherson wrote: > > On Wed, Nov 06, 2024, Kai Huang wrote: > > > For this we only disable TDX but continue to load module. The reason is I think > > > this is similar to enable a specific KVM feature but the hardware doesn't > > > support it. We can go further to check the return value of tdx_cpu_enable() to > > > distinguish cases like "module not loaded" and "unexpected error", but I really > > > don't want to go that far. > > > > Hrm, tdx_cpu_enable() is a bit of a mess. Ideally, there would be a separate > > "probe" API so that KVM could detect if TDX is supported. Though maybe it's the > > TDX module itself is flawed, e.g. if TDH_SYS_INIT is literally the only way to > > detect whether or not a module is loaded. > > We can also use P-SEAMLDR SEAMCALL to query, but I see no difference between > using TDH_SYS_INIT. If you are asking whether there's CPUID or MSR to query > then no. Doesn't have to be a CPUID or MSR, anything idempotent would work. Which begs the question, is that P-SEAMLDR SEAMCALL query you have in mind idempotent? :-) > > So, absent a way to clean up tdx_cpu_enable(), maybe disable the module param if > > it returns -ENODEV, otherwise fail the module load? > > We can, but we need to assume cpuhp_setup_state_cpuslocked() itself will not > return -ENODEV (it is true now), otherwise we won't be able to distinguish > whether the -ENODEV was from cpuhp_setup_state_cpuslocked() or tdx_cpu_enable(). > > Unless we choose to do tdx_cpu_enable() via on_each_cpu() separately. > > Btw tdx_cpu_enable() itself will print "module not loaded" in case of -ENODEV, > so the user will be aware anyway if we only disable TDX but not fail module > loading. That only helps if a human looks at dmesg before attempting to run a TDX VM on the host, and parsing dmesg to treat that particular scenario as fatal isn't something I want to recommend to end users. E.g. if our platform configuration screwed up and failed to load a TDX module, then I want that to be surfaced as an alarm of sorts, not a silent "this platform doesn't support TDX" flag. > My concern is still the whole "different handling of error cases" seems over- > engineering. IMO, that's a symptom of the TDX enabling code not cleanly separating "probe" from "enable", and at a glance, that seems very solvable. And I suspect that cleaning things up will allow for additional hardening. E.g. I assume the lack of MOVDIR64B should be a WARN, but because KVM checks for MOVDIR64B before checking for basic TDX support, it's an non-commitalpr_warn(). > > > 4) tdx_enable() fails. > > > > > > Ditto to 3). > > > > No, this should fail the module load. E.g. most of the error conditions are > > -ENOMEM, which has nothing to do with host support for TDX. > > > > > 5) tdx_get_sysinfo() fails. > > > > > > This is a kernel bug since tdx_get_sysinfo() should always return valid TDX > > > sysinfo structure pointer after tdx_enable() is done successfully. Currently we > > > just WARN() if the returned pointer is NULL and disable TDX only. I think it's > > > also fine. > > > > > > 6) TDX global metadata check fails, e.g., MAX_VCPUS etc. > > > > > > Ditto to 3). For this we disable TDX only. > > > > Where is this code? > > Please check: > > https://github.com/intel/tdx/blob/tdx_kvm_dev-2024-10-25.1-host-metadata-v6-rebase/arch/x86/kvm/vmx/tdx.c > > .. starting at line 3320. Before I forget, that code has a bug. This /* Check TDX module and KVM capabilities */ if (!tdx_get_supported_attrs(&tdx_sysinfo->td_conf) || !tdx_get_supported_xfam(&tdx_sysinfo->td_conf)) goto get_sysinfo_err; will return '0' on error, instead of -EINVAL (or whatever it intends). Back to the main discussion, these checks are obvious "probe" failures and so should disable TDX without failing module load: if (!tdp_mmu_enabled || !enable_mmio_caching) return -EOPNOTSUPP; if (!cpu_feature_enabled(X86_FEATURE_MOVDIR64B)) { pr_warn("MOVDIR64B is reqiured for TDX\n"); return -EOPNOTSUPP; } A kvm_find_user_return_msr() error is obviously a KVM bug, i.e. should definitely WARN and fail module module. Ditto for kvm_enable_virtualization(). The boot_cpu_has(X86_FEATURE_TDX_HOST_PLATFORM) that's buried in tdx_enable() really belongs in KVM. Having it in both is totally fine, but KVM shouldn't do a bunch of work and _then_ check if all that work was pointless. I am ok treating everything at or after tdx_get_sysinfo() as fatal to module load, especially since, IIUC, TD_SYS_INIT can't be undone, i.e. KVM has crossed a point of no return. In short, assuming KVM can query if a TDX module is a loaded, I don't think it's all that much work to do: static bool kvm_is_tdx_supported(void) { if (boot_cpu_has(X86_FEATURE_TDX_HOST_PLATFORM)) return false; if (!<is TDX module loaded>) return false; if (!tdp_mmu_enabled || !enable_mmio_caching) return false; if (WARN_ON_ONCE(!cpu_feature_enabled(X86_FEATURE_MOVDIR64B))) return false; return true; } int __init tdx_bringup(void) { enable_tdx = enable_tdx && kvm_is_tdx_supported(); if (!enable_tdx) return 0; return __tdx_bringup(); }