On 02.10.19 23:37, Qian Cai wrote: > On Tue, 2019-09-24 at 16:36 +0200, David Hildenbrand wrote: >> Since commit 3f906ba23689 ("mm/memory-hotplug: switch locking to a percpu >> rwsem") we do a cpus_read_lock() in mem_hotplug_begin(). This was >> introduced to fix a potential deadlock between get_online_mems() and >> get_online_cpus() - the memory and cpu hotplug lock. The root issue was >> that build_all_zonelists() -> stop_machine() required the cpu hotplug lock: >> The reason is that memory hotplug takes the memory hotplug lock and >> then calls stop_machine() which calls get_online_cpus(). That's the >> reverse lock order to get_online_cpus(); get_online_mems(); in >> mm/slub_common.c >> >> So memory hotplug never really required any cpu lock itself, only >> stop_machine() and lru_add_drain_all() required it. Back then, >> stop_machine_cpuslocked() and lru_add_drain_all_cpuslocked() were used >> as the cpu hotplug lock was now obtained in the caller. >> >> Since commit 11cd8638c37f ("mm, page_alloc: remove stop_machine from build >> all_zonelists"), the stop_machine_cpuslocked() call is gone. >> build_all_zonelists() does no longer require the cpu lock and does no >> longer make use of stop_machine(). >> >> Since commit 9852a7212324 ("mm: drop hotplug lock from >> lru_add_drain_all()"), lru_add_drain_all() "Doesn't need any cpu hotplug >> locking because we do rely on per-cpu kworkers being shut down before our >> page_alloc_cpu_dead callback is executed on the offlined cpu.". The >> lru_add_drain_all_cpuslocked() variant was removed. >> >> So there is nothing left that requires the cpu hotplug lock. The memory >> hotplug lock and the device hotplug lock are sufficient. > > Actually, powerpc does, > > arch_add_memory() > resize_hpt_for_hotplug() > pseries_lpar_resize_hpt() > stop_machine_cpuslocked() > Thanks for that observation. This will need some further thought. Another proof that locking is messed up :) Maybe we should start decoupling locking of the memory onlining/offlining path (e.g., get_online_mems()) from the memory adding/removing path (e.g., later something like get_present_mems()). Then we can push down the cpu hotplug lock to the PPC path and use stop_machine() directly. Time to document properly which lock protects what :) -- Thanks, David / dhildenb