Re: [PATCH v19 023/130] KVM: TDX: Initialize the TDX module when loading the KVM intel kernel module

Chao Gao <chao.gao@xxxxxxxxx> · Fri, 26 Apr 2024 11:21:25 +0800

On Fri, Apr 26, 2024 at 12:21:46AM +0000, Huang, Kai wrote:
>
>> 
>> > > The important thing is that they're handled by _one_ entity.  What we have today
>> > > is probably the worst setup; VMXON is handled by KVM, but TDX.SYS.LP.INIT is
>> > > handled by core kernel (sort of).
>> > 
>> > I cannot argue against this :-)
>> > 
>> > But from this point of view, I cannot see difference between tdx_enable()
>> > and tdx_cpu_enable(), because they both in core-kernel while depend on KVM
>> > to handle VMXON.
>> 
>> My comments were made under the assumption that the code was NOT buggy, i.e. if
>> KVM did NOT need to call tdx_cpu_enable() independent of tdx_enable().
>> 
>> That said, I do think it makes to have tdx_enable() call an private/inner version,
>> e.g. __tdx_cpu_enable(), and then have KVM call a public version.  Alternatively,
>> the kernel could register yet another cpuhp hook that runs after KVM's, i.e. does
>> TDX.SYS.LP.INIT after KVM has done VMXON (if TDX has been enabled).
>
>We will need to handle tdx_cpu_online() in "some cpuhp callback" anyway,
>no matter whether tdx_enable() calls __tdx_cpu_enable() internally or not,
>because now tdx_enable() can be done on a subset of cpus that the platform
>has.

Can you confirm this is allowed again? it seems like this code indicates the
opposite:

https://github.com/intel/tdx-module/blob/tdx_1.5/src/vmm_dispatcher/api_calls/tdh_sys_config.c#L768C1-L775C6

>
>For the latter (after the "Alternatively" above), by "the kernel" do you
>mean the core-kernel but not KVM?
>
>E.g., you mean to register a cpuhp book _inside_ tdx_enable() after TDX is
>initialized successfully?
>
>That would have problem like when KVM is not present (e.g., KVM is
>unloaded after it enables TDX), the cpuhp book won't work at all.

Is "the cpuhp hook doesn't work if KVM is not loaded" a real problem?

The CPU about to online won't run any TDX code. So, it should be ok to
skip tdx_cpu_enable().

Don't get me wrong. I don't object to registering the cpuhp hook in KVM.
I just want you to make decisions based on good information.

>
>If we ever want a new TDX-specific cpuhp hook "at this stage", IMHO it's
>better to have it done by KVM, i.e., it goes away when KVM is unloaded.
>
>Logically, we have two approaches in terms of how to treat
>tdx_cpu_enable():
>
>1) We treat the two cases separately: calling tdx_cpu_enable() for all
>online cpus, and calling it when a new CPU tries to go online in some
>cpuhp hook.  And we only want to call tdx_cpu_enable() in cpuhp book when
>tdx_enable() has done successfully.
>
>That is: 
>
>a) we always call tdx_cpu_enable() (or __tdx_cpu_enable()) inside
>tdx_enable() as the first step, or,
>
>b) let the caller (KVM) to make sure of tdx_cpu_enable() has been done for
>all online cpus before calling tdx_enable().
>
>Something like this:
>
>	if (enable_tdx) {
>		cpuhp_setup_state(CPUHP_AP_KVM_ONLINE, kvm_online_cpu, 
>			...);
>
>		cpus_read_lock();
>		on_each_cpu(tdx_cpu_enable, ...); /* or do it inside 
>						   * in tdx_enable() */
>		enable_tdx = tdx_enable();
>		if (enable_tdx)
>			cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
>				tdx_online_cpu, ...);
>		cpus_read_unlock();
>	}
>
>	static int tdx_online_cpu(unsigned int cpu)
>	{
>		unsigned long flags;
>		int ret;
>
>		if (!enable_tdx)
>			return 0;
>
>		local_irq_save(flags);
>		ret = tdx_cpu_enable();
>		local_irq_restore(flags);
>
>		return ret;
>	}
>
>2) We treat tdx_cpu_enable() as a whole by viewing it as the first step to
>run any TDX code (SEAMCALL) on any cpu, including the SEAMCALLs involved
>in tdx_enable().
>
>That is, we *unconditionally* call tdx_cpu_enable() for all online cpus,
>and when a new CPU tries to go online.
>
>This can be handled at once if we do tdx_cpu_enable() inside KVM's cpuhp
>hook:
>
>	static int vt_hardware_enable(unsigned int cpu)
>	{
>		vmx_hardware_enable();
>
>		local_irq_save(flags);
>		ret = tdx_cpu_enable();
>		local_irq_restore(flags);
>
>		/*
>		 * -ENODEV means TDX is not supported by the platform
>		 * (TDX not enabled by the hardware or module is
>		 * not loaded) or the kernel isn't built with TDX.
>		 *
>		 * Allow CPU to go online as there's no way kernel
>		 * could use TDX in this case.
>		 *
>		 * Other error codes means TDX is available but something
>		 * went wrong.  Prevent this CPU to go online so that
>		 * TDX may still work on other online CPUs.
>		 */
>		if (ret && ret != -ENODEV)
>			return ret;
>
>		return ret;
>	}
>
>So with your change to always enable virtualization when TDX is enabled
>during module load, we can simply have:
>
>	if (enable_tdx)
>		cpuhp_setup_state(CPUHP_AP_KVM_ONLINE, kvm_online_cpu, 
>			...);
>
>		cpus_read_lock();
>		enable_tdx = tdx_enable();
>		cpus_read_unlock();
>	}
>
>So despite the cpus_read_lock() around tdx_enable() is a little bit silly,
>the logic is actually simpler IMHO.
>
>(local_irq_save()/restore() around tdx_cpu_enable() is also silly but that
>is a common problem to both above solution and can be changed
>independently).
>
>Also, as I mentioned that the final goal is to have a TDX-specific CPUHP
>hook in the core-kernel _BEFORE_ any in-kernel TDX user (KVM) to make sure
>all online CPUs are TDX-capable.  
>
>When that happens, I can just move the code in vt_hardware_enable() to
>tdx_online_cpu() and do additional VMXOFF inside it, with the assumption
>that the in-kernel TDX users should manage VMXON/VMXOFF on their own. 
>Then all TDX users can remove the handling of tdx_cpu_enable().