On Thu 09-02-17 11:22:49, Christoph Lameter wrote:
> On Thu, 9 Feb 2017, Thomas Gleixner wrote:
>
> > You are just not getting it, really.
> >
> > The problem is that this for_each_online_cpu() is racy against a concurrent
> > hot unplug and therefore can queue stuff for a no longer online cpu. That's
> > what the mm folks tried to avoid by preventing a CPU hotplug operation
> > before entering that loop.
>
> With a stop machine action it is NOT racy because the machine goes into a
> special kernel state that guarantees that key operating system structures
> are not touched. See mm/page_alloc.c's use of that characteristic to build
> zonelists. Thus it cannot be executing for_each_online_cpu and related
> tasks (unless one does not disable preempt .... but that is a given if a
> spinlock has been taken)..

Christoph, you are completely ignoring the reality and the code. There
is no need for stop_machine, nor is it helping anything. As a matter of
fact, synchronization with cpu hotplug is needed if you want to perform
per-cpu specific operations. get_online_cpus is the most
straightforward and heavy-weight way to do this synchronization, but not
the only one. As the patch [1] describes, we do not really need
get_online_cpus in drain_all_pages because we can do _better_. But this
is not in any way a generic thing applicable to other code paths.

If you disagree then you are free to post patches, but the hand waving
you are doing here is just wasting everybody's time. So please cut it
here unless you have specific proposals to improve the current
situation.

Thanks!

[1] http://lkml.kernel.org/r/20170207201950.20482-1-mhocko@xxxxxxxxxx
-- 
Michal Hocko
SUSE Labs
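
For illustration only, here is a minimal userspace sketch of the generic,
heavy-weight pattern mentioned above: pin the set of online CPUs before
iterating over it, so that a hot unplug cannot slip in between the check
and the queueing. Every toy_* name is hypothetical and this is not kernel
code. In particular, the real get_online_cpus() makes a concurrent hotplug
operation wait, whereas this single-threaded model simply reports "busy".
It does not model the approach taken in patch [1].

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 4

/* Toy stand-ins for kernel state; all names here are hypothetical. */
static unsigned int toy_online_mask = 0xF; /* CPUs 0-3 start online */
static int toy_hotplug_readers;            /* holders pinning the online mask */
static int toy_queued[TOY_NR_CPUS];        /* work items queued per CPU */

/* Analogue of get_online_cpus(): pin the set of online CPUs. */
static void toy_get_online_cpus(void)
{
    toy_hotplug_readers++;
}

/* Analogue of put_online_cpus(): drop the pin again. */
static void toy_put_online_cpus(void)
{
    assert(toy_hotplug_readers > 0);
    toy_hotplug_readers--;
}

/*
 * Analogue of taking a CPU offline. Assumption: the real kernel blocks
 * the hotplug operation until readers are gone; this model returns
 * false instead of blocking.
 */
static bool toy_cpu_down(int cpu)
{
    assert(cpu >= 0 && cpu < TOY_NR_CPUS);
    if (toy_hotplug_readers > 0)
        return false;
    toy_online_mask &= ~(1u << cpu);
    return true;
}

/*
 * Queue one work item on every online CPU while the mask is pinned.
 * Returns the number of CPUs that received work.
 */
static int toy_queue_on_online_cpus(void)
{
    int queued = 0;

    toy_get_online_cpus();
    for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
        if (toy_online_mask & (1u << cpu)) {
            toy_queued[cpu]++;
            queued++;
        }
    }
    toy_put_online_cpus();
    return queued;
}
```

In this model, toy_cpu_down() cannot change the mask while a caller
holds the pin, so the loop never queues work for a CPU that went away
underneath it. That is the property the synchronization is there to
provide; the cost is that every such caller excludes hotplug for the
duration of the loop.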