Re: [PATCH v5] mm/mempolicy: Weighted Interleave Auto-tuning

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, 8 Feb 2025 07:51:38 +0100 Oscar Salvador <osalvador@xxxxxxx> wrote:

> On Fri, Feb 07, 2025 at 12:13:35PM -0800, Joshua Hahn wrote:

[...snip...]

> Hi Joshua
> 
> > diff --git a/drivers/base/node.c b/drivers/base/node.c
> > index 0ea653fa3433..16e7a5a8ebe7 100644
> > --- a/drivers/base/node.c
> > +++ b/drivers/base/node.c
> > @@ -7,6 +7,7 @@
> >  #include <linux/init.h>
> >  #include <linux/mm.h>
> >  #include <linux/memory.h>
> > +#include <linux/mempolicy.h>
> >  #include <linux/vmstat.h>
> >  #include <linux/notifier.h>
> >  #include <linux/node.h>
> > @@ -214,6 +215,12 @@ void node_set_perf_attrs(unsigned int nid, struct access_coordinate *coord,
> >  			break;
> >  		}
> >  	}
> > +
> > +	/* When setting CPU access coordinates, update mempolicy */
> > +	if (access == ACCESS_COORDINATE_CPU) {
> > +		if (mempolicy_set_node_perf(nid, coord))
> > +			pr_info("failed to set node%d mempolicy attrs\n", nid);
> 
> Not a big deal but I think you want to make that consistent with the error
> pr_info? that is: "failed to set mempolicy attrs for node %d".

Hi Oscar, thank you for reviewing my patch!
That sounds good to me. Then that line can be replaced with
pr_info("failed to set mempolicy attrs for node %d\n", nid);

> Also, I guess we cannot reach here with a memoryless node, right?

I think that they should not reach this line, but since memoryless
nodes do not have any memory bandwidth / latency information, it should
be fine. With that said, I think a check like the following would
make this more explicit and possibly guard against any unexpected
calls to mempolicy_set_node_perf:

- if (access == ACCESS_COORDINATE_CPU) {
+ if (access == ACCESS_COORDINATE_CPU && !node_state(nid, N_MEMORY)) {

> > diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> > index 04f35659717a..51edd3663667 100644
> > --- a/mm/mempolicy.c
> > +++ b/mm/mempolicy.c
> > @@ -109,6 +109,7 @@
> >  #include <linux/mmu_notifier.h>
> >  #include <linux/printk.h>
> >  #include <linux/swapops.h>
> > +#include <linux/gcd.h>
> >  
> >  #include <asm/tlbflush.h>
> >  #include <asm/tlb.h>
> > @@ -138,16 +139,18 @@ static struct mempolicy default_policy = {
> >  
> >  static struct mempolicy preferred_node_policy[MAX_NUMNODES];
> >  
> > +static uint64_t *node_bw_table;
> > +
> >  /*
> > - * iw_table is the sysfs-set interleave weight table, a value of 0 denotes
> > - * system-default value should be used. A NULL iw_table also denotes that
> > - * system-default values should be used. Until the system-default table
> > - * is implemented, the system-default is always 1.
> > - *
> > + * iw_table is the interleave weight table.
> > + * If bandwidth data is available and the user is in auto mode, the table
> > + * is populated with default values in [1,255].
> >   * iw_table is RCU protected
> >   */
> >  static u8 __rcu *iw_table;
> >  static DEFINE_MUTEX(iw_table_lock);
> > +static const int weightiness = 32;
> 
> You explain why you chose this value in the changelog, but I would either
> drop a comment, probably in reduce_interleave_weights() elaborating a
> little bit, otherwise someone who stumbles upon that when reading that
> code will have to go through the changelog.

I also think this feedback makes a lot of sense. Maybe something like:
/*
 * 32 is selected as a constant that balances the two goals of:
 * (1) Keeping weight magnitude low, as to prevent too many consecutive
 *     pages from being allocated from the same node before other nodes
 *     get a chance
 * (2) Minimizing the error between bandwidth ratios and weight ratios
 */

Andrew -- I will send over a new v6 for this patch, if that is alright with
you. I will also probably wait a few days before sending it out, in case
there are other changes that folks would like to see. Please let me know
if this works for you / if there is something ele I can do to make this
process smoother.

Thank you again! Have a nice day!
Joshua

> 
> -- 
> Oscar Salvador
> SUSE Labs





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux