On 11/30/21 15:05, Thomas Gleixner wrote:
> Why is this hotplug callback in the CPU starting section to begin with?
Just because the old notifier implementation used CPU_STARTING - in fact the commit messages say that CPU_STARTING was added partly *for* KVM (commit e545a6140b69, "kernel/cpu.c: create a CPU_STARTING cpu_chain notifier", 2008-09-08).
> If you stick it into the online section which runs on the hotplugged CPU
> in thread context:
>
> 	CPUHP_AP_ONLINE_IDLE,
>
> -->	CPUHP_AP_KVM_STARTING,
>
> 	CPUHP_AP_SCHED_WAIT_EMPTY,
>
> then it is allowed to fail and it still works in the right way.
Yes, moving it to the online section should be fine; however, it wouldn't solve the TDX problem. Failure would roll back the hotplug and forbid hotplug altogether when TDX is loaded, which is not acceptable.
Paolo
> When onlining a CPU, there cannot be any vCPU task running on that CPU at that point. When offlining a CPU, it is guaranteed that all user tasks and non-pinned kernel tasks have left the CPU, i.e. there cannot be a vCPU task around either.
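
To make the difference between the two sections concrete, here is a rough, self-contained model of the behaviour discussed above. It is not the kernel's cpuhp code: struct hp_step, hp_bring_up(), hp_demo() and the callbacks are all hypothetical names invented for this sketch, and silently ignoring an error in a step that may not fail is a simplification. The only point is that a step which is allowed to fail causes the already completed steps to be undone in reverse order and the error to be returned, which is the rollback that makes the online section unsuitable for the TDX case.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of a hotplug step (not kernel code). */
struct hp_step {
	const char *name;
	int (*startup)(void);	/* 0 on success, negative on failure */
	void (*teardown)(void);	/* undoes a successful startup */
	bool can_fail;		/* false models a STARTING-like step */
};

/*
 * Run the steps in order. If a step that is allowed to fail returns an
 * error, undo the completed steps in reverse order and return the error.
 * An error from a step that may not fail is ignored here (assumption of
 * this sketch): it cannot be propagated to the caller.
 */
static int hp_bring_up(const struct hp_step *steps, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		int ret = steps[i].startup ? steps[i].startup() : 0;

		if (ret == 0 || !steps[i].can_fail)
			continue;

		/* Roll back steps [0, i) in reverse order. */
		while (i-- > 0) {
			if (steps[i].teardown)
				steps[i].teardown();
		}
		return ret;
	}
	return 0;
}

/* Demo callbacks: count how many steps were rolled back. */
static int undo_count;

static int step_ok(void)    { return 0; }
static int step_fail(void)  { return -1; }
static void step_undo(void) { undo_count++; }

/* Two steps, the second fails; the flag selects which section it models. */
static int hp_demo(bool second_can_fail)
{
	const struct hp_step steps[] = {
		{ "first",  step_ok,   step_undo, true },
		{ "second", step_fail, step_undo, second_can_fail },
	};

	undo_count = 0;
	return hp_bring_up(steps, sizeof(steps) / sizeof(steps[0]));
}
```

With the second step allowed to fail, hp_demo(true) returns the error and the first step is torn down again; with hp_demo(false) the bring-up completes and nothing is rolled back.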