On Wed, 2024-11-06 at 07:01 -0800, Sean Christopherson wrote:
> On Wed, Nov 06, 2024, Kai Huang wrote:
> > On Thu, 2024-10-31 at 13:22 -0700, Sean Christopherson wrote:
> > > On Thu, Oct 31, 2024, Kai Huang wrote:
> > > > On Wed, 2024-10-30 at 08:19 -0700, Sean Christopherson wrote:
> > > > > > +void __init tdx_bringup(void)
> > > > > > +{
> > > > > > +	enable_tdx = enable_tdx && !__tdx_bringup();
> > > > >
> > > > > Ah.  I don't love this approach because it mixes "failure" due to an unsupported
> > > > > configuration, with failure due to unexpected issues.  E.g. if enabling virtualization
> > > > > fails, loading KVM-the-module absolutely should fail too, not simply disable TDX.
> > > >
> > > > Thanks for the comments.
> > > >
> > > > I see your point.  However for "enabling virtualization failure" kvm_init() will
> > > > also try to do that (default behaviour), so if it fails it will result in module
> > > > loading failure eventually.  So while I guess it would be slightly better to
> > > > make module loading fail if "enabling virtualization fails" in TDX, it is a nit
> > > > issue to me.
> > > >
> > > > I think "enabling virtualization failure" is the only "unexpected issue" that
> > > > should result in module loading failure.  For any other TDX-specific
> > > > initialization failure (e.g., any memory allocation in future patches) it's
> > > > better to only disable TDX.
> > >
> > > I disagree.  The platform owner wants TDX to be enabled, KVM shouldn't silently
> > > disable TDX because of a transient, unrelated failure.
> > >
> > > If TDX _can't_ be supported, e.g. because EPT or MMIO SPTE caching was explicitly
> > > disabled, then that's different.  And that's the general pattern throughout KVM.
> > > If a requested feature isn't supported, then KVM continues on and updates the module
> > > param accordingly.  But if something outright fails during setup, KVM aborts the
> > > entire sequence.
> > >
> > > > So I can change to "make loading KVM-the-module fail if enabling virtualization
> > > > fails in TDX", but I want to confirm this is what you want?
> > >
> > > I would prefer the logic to be: reject loading kvm-intel.ko if an operation that
> > > would normally succeed, fails.
> >
> > I looked at the final tdx.c that is in our development branch [*], and below is the
> > list of the things that need to be done to init TDX (the code in
> > __tdx_bringup()), and my thinking of whether to fail loading the module or just
> > disable TDX:
> >
> > 1) Early dependency check fails.  Those include: tdp_mmu_enabled,
> > enable_mmio_caching, X86_FEATURE_MOVDIR64B check and check the presence of
> > TSX_CTL uret MSR.
> >
> > For those we can disable TDX only but continue to load module.
> >
> > 2) Enable virtualization fails.
> >
> > For this we fail to load module (as you suggested).
> >
> > 3) Fail to register TDX cpuhp to do tdx_cpu_enable() and handle cpu hotplug.
> >
> > For this we only disable TDX but continue to load module.  The reason is I think
> > this is similar to enabling a specific KVM feature when the hardware doesn't
> > support it.  We can go further to check the return value of tdx_cpu_enable() to
> > distinguish cases like "module not loaded" and "unexpected error", but I really
> > don't want to go that far.
>
> Hrm, tdx_cpu_enable() is a bit of a mess.  Ideally, there would be a separate
> "probe" API so that KVM could detect if TDX is supported.  Though maybe it's the
> TDX module itself that is flawed, e.g. if TDH_SYS_INIT is literally the only way to
> detect whether or not a module is loaded.

We can also use P-SEAMLDR SEAMCALL to query, but I see no difference between
that and using TDH_SYS_INIT.  If you are asking whether there's a CPUID or MSR
to query then no.

>
> So, absent a way to clean up tdx_cpu_enable(), maybe disable the module param if
> it returns -ENODEV, otherwise fail the module load?

We can, but we need to assume cpuhp_setup_state_cpuslocked() itself will not
return -ENODEV (it is true now), otherwise we won't be able to distinguish
whether the -ENODEV was from cpuhp_setup_state_cpuslocked() or tdx_cpu_enable().

Unless we choose to do tdx_cpu_enable() via on_each_cpu() separately.

Btw tdx_cpu_enable() itself will print "module not loaded" in case of -ENODEV,
so the user will be aware anyway if we only disable TDX but not fail module
loading.

My concern is still that the whole "different handling of error cases" seems
like over-engineering.

>
> > 4) tdx_enable() fails.
> >
> > Ditto to 3).
>
> No, this should fail the module load.  E.g. most of the error conditions are
> -ENOMEM, which has nothing to do with host support for TDX.
>
> > 5) tdx_get_sysinfo() fails.
> >
> > This is a kernel bug since tdx_get_sysinfo() should always return a valid TDX
> > sysinfo structure pointer after tdx_enable() is done successfully.  Currently we
> > just WARN() if the returned pointer is NULL and disable TDX only.  I think it's
> > also fine.
> >
> > 6) TDX global metadata check fails, e.g., MAX_VCPUS etc.
> >
> > Ditto to 3).  For this we disable TDX only.
>
> Where is this code?

Please check:

https://github.com/intel/tdx/blob/tdx_kvm_dev-2024-10-25.1-host-metadata-v6-rebase/arch/x86/kvm/vmx/tdx.c

.. starting at line 3320.

If you want individual commits, here's the list:

KVM: TDX: Get TDX global Information
https://github.com/intel/tdx/commit/6ae3ab1ddb51a4cf0f0810853a24d47d360abaea

KVM: TDX: Get system-wide info about TDX module on initialization
https://github.com/intel/tdx/commit/fd7947118b76f6d4256bc4228e03e73262e67ba2

KVM: TDX: Support per-VM KVM_CAP_MAX_VCPUS extension check
https://github.com/intel/tdx/commit/98162cf99ee728b97a0c9647bd2b39a254da6a4a
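
For reference, the handling Sean proposes for steps 1) to 4) could be sketched
roughly as below.  This is a userspace illustration only, not the code in the
development branch: tdx_bringup_policy_sketch() and its parameters are
hypothetical stand-ins for the results of the real bring-up steps.  Treating
-ENODEV in step 3) as "TDX module not loaded" assumes, as noted above, that
cpuhp_setup_state_cpuslocked() itself never returns -ENODEV.  Steps 5) and 6)
are left out because the thread has not settled them.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical sketch.  Returns 0 if the module load should continue, with
 * *enable_tdx saying whether TDX stays enabled, or a negative errno if the
 * module load should fail.
 */
static int tdx_bringup_policy_sketch(bool deps_ok, int virt_ret,
				     int cpu_enable_ret, int tdx_enable_ret,
				     bool *enable_tdx)
{
	assert(enable_tdx != NULL);

	/* TDX was not requested: nothing to bring up. */
	if (!*enable_tdx)
		return 0;

	/* 1) Unsupported configuration: clear the param, keep loading. */
	if (!deps_ok) {
		*enable_tdx = false;
		return 0;
	}

	/* 2) Enabling virtualization normally succeeds: fail the load. */
	if (virt_ret)
		return virt_ret;

	/* 3) -ENODEV is taken to mean "TDX module not loaded": disable only. */
	if (cpu_enable_ret == -ENODEV) {
		*enable_tdx = false;
		return 0;
	}
	/* Any other error from step 3) is unexpected: fail the load. */
	if (cpu_enable_ret)
		return cpu_enable_ret;

	/* 4) tdx_enable() errors such as -ENOMEM are unrelated to host support. */
	if (tdx_enable_ret)
		return tdx_enable_ret;

	return 0;
}
```

With this split, a missing dependency or a missing TDX module only clears
enable_tdx, while e.g. -ENOMEM from step 4) is propagated so that loading
kvm-intel.ko is rejected.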