On Fri, Apr 26, 2024 at 12:21:46AM +0000, Huang, Kai wrote: > >> >> > > The important thing is that they're handled by _one_ entity. What we have today >> > > is probably the worst setup; VMXON is handled by KVM, but TDX.SYS.LP.INIT is >> > > handled by core kernel (sort of). >> > >> > I cannot argue against this :-) >> > >> > But from this point of view, I cannot see difference between tdx_enable() >> > and tdx_cpu_enable(), because they both in core-kernel while depend on KVM >> > to handle VMXON. >> >> My comments were made under the assumption that the code was NOT buggy, i.e. if >> KVM did NOT need to call tdx_cpu_enable() independent of tdx_enable(). >> >> That said, I do think it makes to have tdx_enable() call an private/inner version, >> e.g. __tdx_cpu_enable(), and then have KVM call a public version. Alternatively, >> the kernel could register yet another cpuhp hook that runs after KVM's, i.e. does >> TDX.SYS.LP.INIT after KVM has done VMXON (if TDX has been enabled). > >We will need to handle tdx_cpu_online() in "some cpuhp callback" anyway, >no matter whether tdx_enable() calls __tdx_cpu_enable() internally or not, >because now tdx_enable() can be done on a subset of cpus that the platform >has. Can you confirm this is allowed again? it seems like this code indicates the opposite: https://github.com/intel/tdx-module/blob/tdx_1.5/src/vmm_dispatcher/api_calls/tdh_sys_config.c#L768C1-L775C6 > >For the latter (after the "Alternatively" above), by "the kernel" do you >mean the core-kernel but not KVM? > >E.g., you mean to register a cpuhp book _inside_ tdx_enable() after TDX is >initialized successfully? > >That would have problem like when KVM is not present (e.g., KVM is >unloaded after it enables TDX), the cpuhp book won't work at all. Is "the cpuhp hook doesn't work if KVM is not loaded" a real problem? The CPU about to online won't run any TDX code. So, it should be ok to skip tdx_cpu_enable(). Don't get me wrong. I don't object to registering the cpuhp hook in KVM. I just want you to make decisions based on good information. > >If we ever want a new TDX-specific cpuhp hook "at this stage", IMHO it's >better to have it done by KVM, i.e., it goes away when KVM is unloaded. > >Logically, we have two approaches in terms of how to treat >tdx_cpu_enable(): > >1) We treat the two cases separately: calling tdx_cpu_enable() for all >online cpus, and calling it when a new CPU tries to go online in some >cpuhp hook. And we only want to call tdx_cpu_enable() in cpuhp book when >tdx_enable() has done successfully. > >That is: > >a) we always call tdx_cpu_enable() (or __tdx_cpu_enable()) inside >tdx_enable() as the first step, or, > >b) let the caller (KVM) to make sure of tdx_cpu_enable() has been done for >all online cpus before calling tdx_enable(). > >Something like this: > > if (enable_tdx) { > cpuhp_setup_state(CPUHP_AP_KVM_ONLINE, kvm_online_cpu, > ...); > > cpus_read_lock(); > on_each_cpu(tdx_cpu_enable, ...); /* or do it inside > * in tdx_enable() */ > enable_tdx = tdx_enable(); > if (enable_tdx) > cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, > tdx_online_cpu, ...); > cpus_read_unlock(); > } > > static int tdx_online_cpu(unsigned int cpu) > { > unsigned long flags; > int ret; > > if (!enable_tdx) > return 0; > > local_irq_save(flags); > ret = tdx_cpu_enable(); > local_irq_restore(flags); > > return ret; > } > >2) We treat tdx_cpu_enable() as a whole by viewing it as the first step to >run any TDX code (SEAMCALL) on any cpu, including the SEAMCALLs involved >in tdx_enable(). > >That is, we *unconditionally* call tdx_cpu_enable() for all online cpus, >and when a new CPU tries to go online. > >This can be handled at once if we do tdx_cpu_enable() inside KVM's cpuhp >hook: > > static int vt_hardware_enable(unsigned int cpu) > { > vmx_hardware_enable(); > > local_irq_save(flags); > ret = tdx_cpu_enable(); > local_irq_restore(flags); > > /* > * -ENODEV means TDX is not supported by the platform > * (TDX not enabled by the hardware or module is > * not loaded) or the kernel isn't built with TDX. > * > * Allow CPU to go online as there's no way kernel > * could use TDX in this case. > * > * Other error codes means TDX is available but something > * went wrong. Prevent this CPU to go online so that > * TDX may still work on other online CPUs. > */ > if (ret && ret != -ENODEV) > return ret; > > return ret; > } > >So with your change to always enable virtualization when TDX is enabled >during module load, we can simply have: > > if (enable_tdx) > cpuhp_setup_state(CPUHP_AP_KVM_ONLINE, kvm_online_cpu, > ...); > > cpus_read_lock(); > enable_tdx = tdx_enable(); > cpus_read_unlock(); > } > >So despite the cpus_read_lock() around tdx_enable() is a little bit silly, >the logic is actually simpler IMHO. > >(local_irq_save()/restore() around tdx_cpu_enable() is also silly but that >is a common problem to both above solution and can be changed >independently). > >Also, as I mentioned that the final goal is to have a TDX-specific CPUHP >hook in the core-kernel _BEFORE_ any in-kernel TDX user (KVM) to make sure >all online CPUs are TDX-capable. > >When that happens, I can just move the code in vt_hardware_enable() to >tdx_online_cpu() and do additional VMXOFF inside it, with the assumption >that the in-kernel TDX users should manage VMXON/VMXOFF on their own. >Then all TDX users can remove the handling of tdx_cpu_enable().