> On Sep 26, 2019, at 3:26 AM, Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>
> OK, this is using for_each_online_cpu but why is this a problem? Have
> you checked what the code actually does? Let's say that online_pages is
> racing with cpu hotplug. A new CPU appears/disappears from the online
> mask while we are iterating it, right? Let's start with cpu offlining
> case. We have two choices, either the cpu is still visible and we update
> its local node configuration even though it will disappear shortly which
> is ok because we are not touching any data that disappears (it's all
> per-cpu). Case when the cpu is no longer there is not really
> interesting. For the online case we might miss a cpu but that should be
> tolerable because that is not any different from triggering the
> online independently of the memory hotplug. So there has to be a hook
> from that code path as well. If there is none then this is buggy
> irrespective of the locking.
>
> Makes sense?

This sounds to me like it requires a lot of auditing and testing. Also,
someone who is more familiar with CPU hotplug should review this patch.
Personally, I am no fan of operating on an incorrect CPU mask to begin
with; things could go wrong really quickly...