On Mon, Jan 25, 2021 at 04:34:23PM -0800, Dave Hansen wrote: > > From: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx> > > Reclaim-based migration is attempting to optimize data placement in > memory based on the system topology. If the system changes, so must > the migration ordering. > > The implementation here is pretty simple and entirely unoptimized. On > any memory or CPU hotplug events, assume that a node was added or > removed and recalculate all migration targets. This ensures that the > node_demotion[] array is always ready to be used in case the new > reclaim mode is enabled. > > This recalculation is far from optimal, most glaringly that it does > not even attempt to figure out if nodes are actually coming or going. > But, given the expected paucity of hotplug events, this should be > fine. > > Signed-off-by: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx> > Cc: Yang Shi <yang.shi@xxxxxxxxxxxxxxxxx> > Cc: David Rientjes <rientjes@xxxxxxxxxx> > Cc: Huang Ying <ying.huang@xxxxxxxxx> > Cc: Dan Williams <dan.j.williams@xxxxxxxxx> > Cc: David Hildenbrand <david@xxxxxxxxxx> > Cc: osalvador <osalvador@xxxxxxx> > --- [...] > + > +/* > + * React to hotplug events that might affect the migration targes > + * like events that online or offline NUMA nodes. > + * > + * The ordering is also currently dependent on which nodes have > + * CPUs. That means we need CPU on/offline notification too. > + */ > +static int migration_online_cpu(unsigned int cpu) > +{ > + set_migration_target_nodes(); > + return 0; > +} > + > +static int migration_offline_cpu(unsigned int cpu) > +{ > + set_migration_target_nodes(); > + return 0; > +} > + > +/* > + * This leaves migrate-on-reclaim transiently disabled > + * between the MEM_GOING_OFFLINE and MEM_OFFLINE events. > + * This runs reclaim-based micgration is enabled or not. > + * This ensures that the user can turn reclaim-based > + * migration at any time without needing to recalcuate > + * migration targets. > + * > + * These callbacks already hold get_online_mems(). That > + * is why __set_migration_target_nodes() can be used as > + * opposed to set_migration_target_nodes(). > + */ > +static int __meminit migrate_on_reclaim_callback(struct notifier_block *self, > + unsigned long action, void *arg) > +{ > + switch (action) { > + case MEM_GOING_OFFLINE: > + /* > + * Make sure there are not transient states where > + * an offline node is a migration target. This > + * will leave migration disabled until the offline > + * completes and the MEM_OFFLINE case below runs. > + */ > + disable_all_migrate_targets(); > + break; > + case MEM_OFFLINE: > + case MEM_ONLINE: > + /* > + * Recalculate the target nodes once the node > + * reaches its final state (online or offline). > + */ > + __set_migration_target_nodes(); > + break; > + case MEM_CANCEL_OFFLINE: > + /* > + * MEM_GOING_OFFLINE disabled all the migration > + * targets. Reenable them. > + */ > + __set_migration_target_nodes(); > + break; > + case MEM_GOING_ONLINE: > + case MEM_CANCEL_ONLINE: > + break; > + } > + > + return notifier_from_errno(0); > +} This looks good, and I kinda like it. But in this case, all we care about is whether NUMA node does or does not have memory, so we have to remove/added into the demotion list. So, would make more sense to have a kinda helper in node_states_{set,clear}_node that calls the respective functions (disable_all_migrate_targets and __set_migration_target_nodes)? -- Oscar Salvador SUSE L3