On 01/10/2013 10:59 PM, Eric W. Biederman wrote:
...
> disable_nonboot_cpus() should really be called
> sometimes_dangerously_hotunplug_all_but_one_cpu().
>
> If that code is going to be something other than power management
> specific it is not cool that disable_nonboot_cpus() is not always
> enabled when SMP is enabled. It means that architectures need to
> implement two different ways of shutting down cpus.
>
> One of the truly nasty things about cpu_hotplug is that it requires that
> irqs be migrated away from a cpu with interrupts disabled, which at
> least on x86 in some interrupt delivery modes is impossible to do
> safely. The only way to losslessly (and without wedging irq
> controllers) in those interrupt delivery modes (needed for more than 8
> cpus) is to migrate an irq in it's irq handler. Which is fine for
> setting /proc/irq/$N/smp_affinity but is useless for cpu hot-unplug,
> where we need to guarantee that all irqs are going to stop hitting a
> cpu.

I'm a little confused about this; disable_nonboot_cpus() calls
_cpu_down() to hot-unplug the CPU, and the regular CPU hotplug path is
drivers/base/cpu.c:store_online() -> cpu_down() -> _cpu_down(), i.e. the
same basic code. Given that, why is CPU hot unplug safe on x86 via sysfs
(well, I'm just assuming it must be, since it's enabled in my regular
distro kernel and appears to work fine), but not safe when it's done
from disable_nonboot_cpus()?

> Now sometimes_dangerously_hotunplug_all_but_one_cpu() on the reboot path
> was added just a few months ago in Oct 2012, and it appears to due to
> weird ARM maintainership. At least the x86 reboot_cpu_id option is
> broken due to that addition.

I guess this is true because reboot_cpu_id isn't honored unless that CPU
is still online, and disable_nonboot_cpus() has most likely already
taken it offline. Perhaps x86's native_machine_shutdown() should
explicitly bring that CPU online if it's in cpu_possible_mask?
Or perhaps disable_nonboot_cpus() should be enhanced with x86's
reboot_cpu_id logic, so it simply works the same way across all
architectures; that way some chunk of x86's native_machine_shutdown()
could be made common.

...
> We should remove disable_nonboot_cpus() from the reboot path. It is
> still a crazy unmaintained cpu hotplug mess.

Even if we updated it to standardize the reboot_cpu_id logic across all
architectures, and hence made it still suitable for x86?
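
To make that suggestion a bit more concrete, here is a rough user-space
model of what a common "pick the CPU to survive, then offline the rest"
step might look like. This is only a sketch, not kernel code: the names
cpumask_model_t, cpu_is_online(), pick_reboot_cpu() and
offline_all_but() are all made up for illustration, the CPU mask is
modelled as a plain 64-bit integer, and the selection rule (use the
requested CPU only if it is in range and online, otherwise stay on the
current CPU) is my assumption about how the x86 reboot_cpu_id handling
would generalize.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: bit N set means CPU N is online. */
typedef uint64_t cpumask_model_t;

/* Returns 1 if 'cpu' is a valid CPU number and is marked online. */
int cpu_is_online(cpumask_model_t online, int cpu, int nr_cpus)
{
	return cpu >= 0 && cpu < nr_cpus && ((online >> cpu) & 1u);
}

/*
 * Choose the CPU that should survive the shutdown: honor the requested
 * CPU only if it is in range and online, otherwise fall back to the
 * CPU we are currently running on.
 */
int pick_reboot_cpu(int requested, int current,
		    cpumask_model_t online, int nr_cpus)
{
	assert(nr_cpus > 0 && nr_cpus <= 64);

	if (cpu_is_online(online, requested, nr_cpus))
		return requested;
	return current;
}

/* Model of "hot-unplug everything except 'keep'": clear all other bits. */
cpumask_model_t offline_all_but(cpumask_model_t online, int keep)
{
	return online & ((cpumask_model_t)1 << keep);
}
```

The point of the ordering is that the surviving CPU gets chosen before
anything is taken offline, so a requested reboot CPU would not be
unplugged first and then silently ignored.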