On Tue, Oct 29 2024 at 14:05, Costa Shulyupin wrote:
> index afc920116d42..44c7da0e1b8d 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -171,7 +171,7 @@ static bool cpuhp_step_empty(bool bringup, struct cpuhp_step *step)
>   *
>   * Return: %0 on success or a negative errno code
>   */
> -static int cpuhp_invoke_callback(unsigned int cpu, enum cpuhp_state state,
> +int cpuhp_invoke_callback(unsigned int cpu, enum cpuhp_state state,
>  				 bool bringup, struct hlist_node *node,
>  				 struct hlist_node **lastp)

This is deep internal functionality of cpu hotplug and it is only valid
when the hotplug lock is write held, or if it is read held _and_ the
state mutex is held.

Otherwise it is completely unprotected against a concurrent state or
instance insertion/removal and against concurrent invocations of this
function.

And no, we are not going to expose the state mutex just because. CPU
hotplug is complex enough already and we really don't need more side
channels into it. Subsystems are supposed to go through the documented
cpuhp_setup_state() interfaces (a sketch of that is appended at the end
of this mail).

There is another issue with this approach in general:

 1) The 3 block states are just the tip of the iceberg. You are going
    to play a whack-a-mole game to add other subsystems/drivers as
    well.

 2) The whole logic has ordering constraints. The states have strict
    ordering for a reason. So what guarantees that e.g. BLK_MQ_ONLINE
    has no dependencies on non-BLK-related states being invoked before
    it?

    I'm failing to see the analysis of correctness here. Just because
    it did not explode right away does not make it correct.

We've had enough subtle problems with ordering and dependencies in the
past. No need to introduce new ones.

CPU hotplug solves this problem without any hackery. Take a CPU
offline, change the mask of that CPU and bring it online again. Repeat
until all CPU changes are done (a rough sketch of that loop is also
appended below).

If some user space component cannot deal with that, then fix that
instead of inflicting fragile and unmaintainable complexity on the
kernel.

That kubernetes problem has been known since 2018 and nobody has
actually sat down and solved it. Now we waste another six years to make
it "work" magically in the kernel.

This needs userspace awareness anyway. If you isolate a CPU then tasks
or containers which are assigned to that CPU need to move away and the
container has to exclude that CPU. If you remove the isolation, then
what magically opens the CPU up for existing containers?

I'm not buying any of this "it will just work and nobody notices"
handwaving.

Thanks,

	tglx
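
P.S.: For illustration only, a minimal sketch of how a subsystem hooks
into CPU hotplug through the documented state API rather than calling
the internals directly. The core takes the hotplug lock and the state
mutex for you. The driver name, state name and callbacks here are made
up for the example; this is not the patch under discussion.

/*
 * Sketch: register online/offline callbacks via cpuhp_setup_state().
 * "my_driver" and its callbacks are hypothetical placeholders.
 */
#include <linux/cpu.h>
#include <linux/cpuhotplug.h>
#include <linux/module.h>

static enum cpuhp_state my_driver_hp_state;

static int my_driver_cpu_online(unsigned int cpu)
{
	/* Set up per-CPU resources for @cpu */
	return 0;
}

static int my_driver_cpu_offline(unsigned int cpu)
{
	/* Tear down per-CPU resources for @cpu */
	return 0;
}

static int __init my_driver_init(void)
{
	int ret;

	/*
	 * CPUHP_AP_ONLINE_DYN allocates a dynamic state in the online
	 * section. The online callback is invoked on every CPU which
	 * is or becomes online, the offline callback on every CPU
	 * which goes down. The core handles all locking and ordering.
	 */
	ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "mydriver:online",
				my_driver_cpu_online, my_driver_cpu_offline);
	if (ret < 0)
		return ret;

	my_driver_hp_state = ret;
	return 0;
}

static void __exit my_driver_exit(void)
{
	cpuhp_remove_state(my_driver_hp_state);
}

module_init(my_driver_init);
module_exit(my_driver_exit);
MODULE_LICENSE("GPL");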
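
And a rough userspace sketch of the offline/reconfigure/online loop
suggested above, assuming the standard sysfs hotplug files. Error
handling is trimmed, the CPU list is made up, and the actual
isolation/mask update is left as a placeholder.

/*
 * Sketch: take one CPU offline, apply the configuration change for
 * that CPU, bring it online again, then move on to the next CPU.
 */
#include <stdio.h>
#include <stdlib.h>

static int cpu_set_online(unsigned int cpu, int online)
{
	char path[64];
	FILE *f;

	snprintf(path, sizeof(path),
		 "/sys/devices/system/cpu/cpu%u/online", cpu);
	f = fopen(path, "w");
	if (!f)
		return -1;
	fprintf(f, "%d\n", online);
	return fclose(f);
}

int main(void)
{
	/* CPUs whose isolation setting should change; hypothetical list */
	unsigned int cpus[] = { 2, 3 };
	unsigned int i;

	for (i = 0; i < sizeof(cpus) / sizeof(cpus[0]); i++) {
		if (cpu_set_online(cpus[i], 0))
			return EXIT_FAILURE;

		/*
		 * Placeholder: update the isolation/housekeeping
		 * configuration for cpus[i] while it is offline.
		 */

		if (cpu_set_online(cpus[i], 1))
			return EXIT_FAILURE;
	}
	return EXIT_SUCCESS;
}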