Re: [RFC PATCH v4 0/3] memcg weighted interleave mempolicy control

Gregory Price <gregory.price@xxxxxxxxxxxx> · Mon, 4 Dec 2023 08:50:12 -0500

On Mon, Dec 04, 2023 at 04:19:02PM +0800, Huang, Ying wrote:
> Gregory Price <gregory.price@xxxxxxxxxxxx> writes:
> 
> > If the structure is built as a matrix of (cpu_node,mem_nodes),
> > then you can also optimize based on the node the task is running on.
> 
> The matrix stuff makes the situation complex.  If people do need
> something like that, they can just use set_mempolicy2() with user
> specified weights.  I still believe that "make simple stuff simple, and
> complex stuff possible".
> 

I don't think it's particularly complex, since we already have a
distance matrix for numa nodes:

available: 2 nodes (0-1)
... snip ...
node distances:
node   0   1
  0:  10  21
  1:  21  10

The weights would follow the same (src,dst) layout, just adjustable and
based on bandwidth rather than distance.

I personally find the (src,dst) matrix very important for flexibility.

But if there is particular pushback against it, having a one-dimensional
array is better than not having it, so I will take what I can get.
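
As a rough sketch of the difference (this is not the RFC code; the names
il_weight, il_state and il_next_node and the weight values are made up
purely for illustration), a (src,dst) table lets the interleave pick a
row based on the node the task is running on, while the one-dimensional
version is the same thing with a single row:

```c
#include <stddef.h>

#define NR_NODES 2 /* matches the two-node example above */

/*
 * Hypothetical weights, not measured values: il_weight[src][dst] is the
 * number of consecutive pages placed on memory node dst per round when
 * the task is running on CPU node src.
 */
static const unsigned char il_weight[NR_NODES][NR_NODES] = {
	{ 3, 1 },	/* running on node 0: favor local node 0 */
	{ 1, 3 },	/* running on node 1: favor local node 1 */
};

struct il_state {
	int node;		/* memory node currently being filled */
	unsigned int left;	/* pages left on it before advancing */
};

/* Return the memory node for the next page allocated from CPU node src. */
static int il_next_node(struct il_state *st, int src)
{
	int tries;

	if (st->left == 0) {
		/* Advance to the next node with a nonzero weight. */
		for (tries = 0; tries < NR_NODES; tries++) {
			st->node = (st->node + 1) % NR_NODES;
			st->left = il_weight[src][st->node];
			if (st->left)
				break;
		}
		if (st->left == 0)
			return src;	/* all weights zero: stay local */
	}
	st->left--;
	return st->node;
}
```

Starting from { .node = NR_NODES - 1, .left = 0 }, a task on node 0 gets
pages on nodes 0,0,0,1 and then repeats, while a task on node 1 gets
0,1,1,1.  Dropping the src index collapses this to one global row, which
is the one-dimensional fallback.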

> > That feels very intuitive, deals with many race condition issues, and
> > the global setting can actually be implemented without the need for
> > set_mempolicy2 at all - which is certainly a bonus.
> >
> > Would love more thoughts here.  Will have a new RFC with set_mempolicy2,
> > mbind2, and MPOL_WEIGHTED_INTERLEAVE soon that demonstrates the above.
> 
> Thanks for doing all these!
> 

Someone's got to :]

~Gregory