On Fri, Mar 11, 2022 at 01:06:26PM +0800, Huang, Ying wrote: > Oscar Salvador <osalvador@xxxxxxx> writes: > > -static int __init migrate_on_reclaim_init(void) > > -{ > > - int ret; > > - > > node_demotion = kmalloc_array(nr_node_ids, > > sizeof(struct demotion_nodes), > > GFP_KERNEL); > > WARN_ON(!node_demotion); > > > > - ret = cpuhp_setup_state_nocalls(CPUHP_MM_DEMOTION_DEAD, "mm/demotion:offline", > > - NULL, migration_offline_cpu); > > /* > > - * In the unlikely case that this fails, the automatic > > - * migration targets may become suboptimal for nodes > > - * where N_CPU changes. With such a small impact in a > > - * rare case, do not bother trying to do anything special. > > + * At this point, all numa nodes with memory/CPus have their state > > + * properly set, so we can build the demotion order now. > > */ > > - WARN_ON(ret < 0); > > - ret = cpuhp_setup_state(CPUHP_AP_MM_DEMOTION_ONLINE, "mm/demotion:online", > > - migration_online_cpu, NULL); > > - WARN_ON(ret < 0); > > - > > + set_migration_target_nodes(); > > If my understanding were correct, we should enclose > set_migration_target_nodes() here with cpus_read_lock(). And add some > comment before set_migration_target_nodes() for this. I don't know > whether the locking order is right. Oh, I see that cpuhp_setup_state() holds the cpu-hotplug lock while calling in, so yeah, we might want to hold in there. The thing is, not long ago we found out that we could have ACPI events like memory-hotplug operations at boot stage [1], so I guess it is safe to assume we could also have cpu-hotplug operations at that stage as well, and so we want to hold cpus_read_lock() just to be on the safe side. But, unless I am missing something, that does not apply to set_migration_target_nodes() being called from a callback, as the callback (somewhere up the chain) already holds that lock. e.g: (_cpu_up takes cpus_write_lock()) and the same for the down operation. So, to sum it up, we only need the cpus_read_lock() in migrate_on_reclaim_init(). > > hotplug_memory_notifier(migrate_on_reclaim_callback, 100); > > And we should register the notifier before calling set_migration_target_nodes()? I cannot made my mind here. The primary reason I placed the call before registering the notifier is because the original code called set_migration_target_nodes() before doing so: <-- ret = cpuhp_setup_state(CPUHP_AP_MM_DEMOTION_ONLINE, "mm/demotion:online", migration_online_cpu, NULL); WARN_ON(ret < 0); hotplug_memory_notifier(migrate_on_reclaim_callback, 100); --> I thought about following the same line. Why do you think it should be called afterwards? I am not really sure whether it has a different impact depending on the order. Note that memory-hotplug acpi events can happen at boot time, so by the time we register the memory_hotplug notifier, we can have some hotplug memory coming in, and so we call set_migration_target_nodes(). But that is fine, and I cannot see a difference shufling the order of them. Do you see a problem in there? [1] https://patchwork.kernel.org/project/linux-mm/patch/20200915094143.79181-3-ldufour@xxxxxxxxxxxxx/ -- Oscar Salvador SUSE Labs