Hi Hyeonggon, thank you for the review!

[...snip...]

> Hi Joshua, thanks for the update!
> It actually is what I was intended in the manual / auto mode description.
>
> I don't have a strong opinion on the weight of the hot-plugged NUMA node
> in manual mode, as it's not ideal whatever weight we choose and the user
> need to update the weight after hot-plug events anyway.

I'm glad that I was able to correctly interpret the framework you laid out
in the previous conversations. And yes -- I agree, I think no matter what
value I choose, it will always be sub-optimal for some definition of
optimality. I simply chose 1 because it is now the new smallest weight
possible, since 0 no longer works.

> Some comments inlined below:
>
> > diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-mempolicy-weighted-interleave b/Documentation/ABI/testing/sysfs-kernel-mm-mempolicy-weighted-interleave
> > index 0b7972de04e9..d30dc29c53ff 100644
> > --- a/Documentation/ABI/testing/sysfs-kernel-mm-mempolicy-weighted-interleave
> > +++ b/Documentation/ABI/testing/sysfs-kernel-mm-mempolicy-weighted-interleave
> > @@ -20,6 +20,30 @@ Description:	Weight configuration interface for nodeN
> >  		Minimum weight: 1
> >  		Maximum weight: 255
> >
> > -		Writing an empty string or `0` will reset the weight to the
> > -		system default. The system default may be set by the kernel
> > -		or drivers at boot or during hotplug events.
> > +		Writing invalid values (i.e. any values not in [1,255],
> > +		empty string, ...) will return -EINVAL.
> > +
> > +What:		/sys/kernel/mm/mempolicy/weighted_interleave/mode
> > +Date:		January 2025
> > +Contact:	Linux memory management mailing list <linux-mm@xxxxxxxxx>
> > +Description:	Auto-weighting configuration interface
> > +
> > +		Configuration modes for weighted interleave. Can take one of
> > +		two options: "manual" and "auto". Default is "auto".
> > +
> > +		In auto mode, all node weights are re-calculated and overwritten
> > +		(visible via the nodeN interfaces) whenever new bandwidth data
> > +		is made available either during boot or hotplug events.
> > +
> > +		In manual mode, node weights can only be updated by the user.
> > +		If a node is hotplugged while the user is in manual mode,
> > +		the node will have a default weight of 1.
> > +
> > +		Modes can be changed by writing either "auto" or "manual" to the
> > +		interface. All other strings will be ignored, and -EINVAL will
> > +		be returned. If "auto" is written to the interface but the
> > +		recalculation / updates fail at any point (-ENOMEM or -ENODEV)
> > +		then the mode will remain in manual mode.
> > +
> > +		Writing a new weight to a node directly via the nodeN interface
> > +		will also automatically update the system to manual mode.
>
> I think the last paragraph should also be included in the nodeX parameter.

I agree, I will definitely add this in the next version!

> > @@ -2450,16 +2548,8 @@ static unsigned long alloc_pages_bulk_array_weighted_interleave(gfp_t gfp,
> >  	if (!weights)
> >  		return total_allocated;
> >
> > -	rcu_read_lock();
> > -	table = rcu_dereference(iw_table);
> > -	if (table)
> > -		memcpy(weights, table, nr_node_ids);
> > -	rcu_read_unlock();
> > -
> > -	/* calculate total, detect system default usage */
> >  	for_each_node_mask(node, nodes) {
> > -		if (!weights[node])
> > -			weights[node] = 1;
> > +		weights[node] = get_il_weight(node);
> >  		weight_total += weights[node];
> >  	}
>
> Uh-hum...
> Looks like it now allows copying weights from different versions of iw_tables?

This is a good point. This is an artifact from a previous iteration, where
get_il_weight was needed to handle the weight being 0. Since we no longer
allow 0 as a value, it makes more sense to just take a snapshot of the whole
table under a single rcu_read_lock() section. Thank you for the catch!
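For illustration, here is a minimal userspace sketch of the single-snapshot
idea discussed above. It is not the actual kernel change: MAX_NODES,
iw_table_model, read_lock_model() / read_unlock_model(), snapshot_weights()
and total_weight() are hypothetical stand-ins, the lock stubs only mark where
rcu_read_lock() / rcu_read_unlock() would sit, and the fallback weight of 1
for a missing table is an assumption.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_NODES 8	/* hypothetical; stands in for nr_node_ids */

/* Stand-in for the RCU-protected iw_table pointer (NULL = no table yet). */
static const uint8_t *iw_table_model;

/* Stubs marking where rcu_read_lock()/rcu_read_unlock() would go. */
static void read_lock_model(void) {}
static void read_unlock_model(void) {}

/*
 * Copy every weight from ONE table version inside a single read-side
 * section, so the caller never mixes weights from two different tables.
 */
static void snapshot_weights(uint8_t *weights, size_t nr_nodes)
{
	const uint8_t *table;

	read_lock_model();
	table = iw_table_model;		/* rcu_dereference() in the kernel */
	if (table)
		memcpy(weights, table, nr_nodes);
	else
		memset(weights, 1, nr_nodes);	/* assumed fallback: weight 1 */
	read_unlock_model();
}

/* Sum the snapshot; this runs on the private copy, outside the section. */
static unsigned int total_weight(const uint8_t *weights, size_t nr_nodes)
{
	unsigned int total = 0;

	for (size_t i = 0; i < nr_nodes; i++)
		total += weights[i];
	return total;
}
```

The point of the sketch is the contrast with a per-node lookup: if each node's
weight were fetched in its own read-side section, a table swap between two
lookups could leave the caller with weights from two different versions.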
I will also go over the other places this is used and just make sure the
locking behavior is as intended.

> Otherwise this patch looks good to me.
>
> Best,
> Hyeonggon

Thanks again Hyeonggon, I'll send out a v4 with the changes you mentioned!
Have a great day!!
Joshua