On 09/28, Peter Zijlstra wrote:
>
> On Sat, Sep 28, 2013 at 02:48:59PM +0200, Oleg Nesterov wrote:
> >
> > Please note that this wait_event() adds a problem... it doesn't allow
> > to "offload" the final synchronize_sched(). Suppose a 4k cpu machine
> > does disable_nonboot_cpus(), we do not want 2 * 4k * synchronize_sched's
> > in this case. We can solve this, but this wait_event() complicates
> > the problem.
>
> That seems like a particularly easy fix; something like so?

Yes, but...

> @@ -586,6 +603,11 @@ int disable_nonboot_cpus(void)
>
> +	cpu_hotplug_done();
> +
> +	for_each_cpu(cpu, frozen_cpus)
> +		cpu_notify_nofail(CPU_POST_DEAD_FROZEN, (void*)(long)cpu);

This changes the protocol, I simply do not know if it is fine in general
to do __cpu_down(another_cpu) without CPU_POST_DEAD(previous_cpu). Say,
currently it is possible that CPU_DOWN_PREPARE takes some global lock
released by CPU_DOWN_FAILED or CPU_POST_DEAD.

Hmm. Now that workqueues do not use CPU_POST_DEAD, it has only 2 users,
mce_cpu_callback() and cpufreq_cpu_callback() and the 1st one even ignores
this notification if FROZEN. So yes, probably this is fine, but needs an
ack from cpufreq maintainers (cc'ed), for example to ensure that it is
fine to call __cpufreq_remove_dev_prepare() twice without _finish().

Oleg.
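
The ordering concern in the message above (a callback that takes a global
lock in CPU_DOWN_PREPARE and drops it in CPU_DOWN_FAILED or CPU_POST_DEAD)
can be illustrated with a small userspace model. This is only a sketch under
stated assumptions: the enum hp_event values, struct subsys, subsys_notify()
and both down_cpus_*() helpers are hypothetical names made up for
illustration, not kernel code, and no in-tree notifier is claimed to behave
this way.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical event names mirroring the notifications named in the thread;
 * the values are arbitrary and are not the kernel's. */
enum hp_event { HP_DOWN_PREPARE, HP_DOWN_FAILED, HP_POST_DEAD };

/* State of a made-up subsystem that holds one non-recursive global lock
 * from DOWN_PREPARE until DOWN_FAILED or POST_DEAD. */
struct subsys {
	bool lock_held;
	int double_acquires;	/* PREPARE seen while the lock was already held */
};

/* Hypothetical notifier callback for that subsystem. */
static void subsys_notify(struct subsys *s, enum hp_event ev)
{
	switch (ev) {
	case HP_DOWN_PREPARE:
		if (s->lock_held)
			s->double_acquires++;	/* a real non-recursive lock would deadlock here */
		s->lock_held = true;
		break;
	case HP_DOWN_FAILED:
	case HP_POST_DEAD:
		s->lock_held = false;
		break;
	}
}

/* Model of the existing ordering: POST_DEAD(cpu) is delivered before the
 * next CPU is taken down. Returns the number of double acquisitions. */
int down_cpus_in_order(int ncpus)
{
	struct subsys s = { false, 0 };

	for (int cpu = 0; cpu < ncpus; cpu++) {
		subsys_notify(&s, HP_DOWN_PREPARE);
		subsys_notify(&s, HP_POST_DEAD);
	}
	assert(!s.lock_held);
	return s.double_acquires;
}

/* Model of the proposed ordering: every POST_DEAD is deferred until all
 * CPUs are down. Returns the number of double acquisitions. */
int down_cpus_batched(int ncpus)
{
	struct subsys s = { false, 0 };

	for (int cpu = 0; cpu < ncpus; cpu++)
		subsys_notify(&s, HP_DOWN_PREPARE);
	for (int cpu = 0; cpu < ncpus; cpu++)
		subsys_notify(&s, HP_POST_DEAD);
	assert(!s.lock_held);
	return s.double_acquires;
}
```

In this model down_cpus_in_order(3) reports no double acquisition, while
down_cpus_batched(3) reports two, one for each DOWN_PREPARE that arrives
before the previous CPU's POST_DEAD. Whether any remaining CPU_POST_DEAD
user actually depends on the old ordering is exactly the open question the
message leaves to the cpufreq maintainers.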