On Fri, 13 Dec 2024 15:19:20 +0900 Hyeonggon Yoo <hyeonggon.yoo@xxxxxx> wrote: > On 2024-12-11 06:54 AM, Joshua Hahn wrote: > > This patch introduces an auto-configuration for the interleave weights > > that aims to balance the two goals of setting node weights to be > > proportional to their bandwidths and keeping the weight values low. > > This balance is controlled by a value max_node_weight, which defines the > > maximum weight a single node can take. > > Hi Joshua, > > I am wondering how this is going to work for host memory + CXL memory > interleaving. I guess by "the ACPI table" you mean the ACPI HMAT or CXL > CDAT, both of which does not provide the bandwidth of host memory. > If this feature does not consider the bandwidth of host memory, manual > configuration will be inevitable anyway. Hi Hyeonggon, Thank you for reviewing my patch! As Gregory showed in his reply, I think it would be possible to get host bandwidth information using the ACPI HMAT. [-----8<-----] > > +What: /sys/kernel/mm/mempolicy/weighted_interleave/max_node_weight > > +Date: December 2024 > > +Contact: Linux memory management mailing list <linux-mm@xxxxxxxxx> > > +Description: Weight limiting / scaling interface > > + > > + The maximum interleave weight for a memory node. When it is > > + updated, any previous changes to interleave weights (i.e. via > > + the nodeN sysfs interfaces) are ignored, and new weights are > > + calculated using ACPI-reported bandwidths and scaled. > > + > > At first this paragraph sounded like "previously stored weights are > discarded after setting max_node_weight", but I think you mean > "User can override the default values, but defaults values are > calculated regardless of the values set by the user". Right? In the implementation, the first way you interpreted is the correct description. That is, if a user manually changes a ndoe weight, then updates the max_node_weight, the previous manual change will be overwritten by the newly scaled values. Does this behavior make sense? Perhaps it makes sense to ignore user-changed values when doing the re-scaling if a user decides to change the max_node_weight value themselves. Regardless of what implementation makes sense, I can re-write the description so that there is no ambiguity when it comes to the expected behavior of the code. Thank you for pointing this out! > [...snip...] > > > +int mempolicy_set_node_perf(unsigned int node, struct access_coordinate *coords) > > +{ > > + unsigned long *old_bw, *new_bw; > > + unsigned long bw_val; > > + u8 *old_iw, *new_iw; > > + > > + /* > > + * Bandwidths above this limit causes rounding errors when reducing > > + * weights. This value is ~16 exabytes, which is unreasonable anyways. > > + */ > > + bw_val = min(coords->read_bandwidth, coords->write_bandwidth); > > + if (bw_val > (U64_MAX / 10)) > > + return -EINVAL; > > + > > + new_bw = kcalloc(nr_node_ids, sizeof(unsigned long), GFP_KERNEL); > > + if (!new_bw) > > + return -ENOMEM; > > + > > + new_iw = kzalloc(nr_node_ids, GFP_KERNEL); > > I think kcalloc(nr_node_ids, sizeof(u8), GFP_KERNEL); will be more readable. I see, thank you for your input. I will make this change in a v2. > > @@ -2012,11 +2105,12 @@ static unsigned int weighted_interleave_nid(struct mempolicy *pol, pgoff_t ilx) > > > > rcu_read_lock(); > > table = rcu_dereference(iw_table); > > + defaults = rcu_dereference(iw_table); > > Probably you intended rcu_dereference(default_iw_table)? Yes -- thank you for the catch. I will also make this change. > > static struct iw_node_attr **node_attrs; > > +static struct kobj_attribute *max_nw_attr; > > Where is max_nw_attr initialized? Oh thank you for this catch! You are correct, max_nw_attr is never initalized. Actually, there is a typo in which I never use max_nw_attr, I accidentally rename a different sysfs interface to act as the intended max_nw_attr. I will make this change as well and post a v2. > Best, > Hyeonggon Thank you for your input, I will make the changes that you mentioned regardnig readability & typos. I hope to hear from you regarding the thoughts on the behavior of re-scaling all node weights when users update max_node_weight, and whether that should overwrite manually set node weights. Have a great day! Joshua