Re: [RFC 1/1] mm/mempolicy: introduce system default interleave weights

Gregory Price <gregory.price@xxxxxxxxxxxx> · Tue, 27 Feb 2024 01:11:50 -0500

On Tue, Feb 27, 2024 at 01:59:26PM +0800, Huang, Ying wrote:
> Gregory Price <gregory.price@xxxxxxxxxxxx> writes:
> 
> > I have to press this issue: Is this an actual, practical, concern?
> 
> I don't know who have large machine like that.  But I guess that it's
> possible in the long run.
>

Certainly possible, although that seems like a hyper-specialized case of
a supercomputer.  I suppose still worth considering for a bit.

> > I suppose another strategy is to calculate the interleave weights
> > un-bounded from the raw bandwidth - but continuously force reductions
> > (through some yet-undefined algorithm) until at least one node reaches a
> > weight of `1`.  This suffers from the opposite problem: what if the top
> > node has a value greater than 255? Do we just cap it at 255? That seems
> > the opposite form of problematic.
> >
> > (Large numbers are quite pointless, as it is essentially the antithesis
> > of interleave)
> 
> Yes.  So I suggest to use a relative small number as the default weight
> to start with for normal DRAM.  We will have to floor/ceiling the weight
> value.

Yeah, more concretely, I was thinking something like:

unsigned int *temp_weights; /* nr_node_ids entries */

memcpy(temp_weights, node_bandwidth, nr_node_ids * sizeof(*temp_weights));
while min(temp_weights) > 1:
    - attempt GCD reduction (divide every weight by the common GCD)
    - if failed (GCD=1), adjust all odd numbers to be even (+1), try again

for each node N, with weight = temp_weights[N]:
    iw_table[N] = (weight > 255) ? 255 : (unsigned char)weight;

Something like this.  Of course this breaks if you have two nodes with a
massively different bandwidth ratio (> 255:1), but that seems
unrealistic given the intent of the devices.
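
To make the loop above a bit more concrete, here is a rough userspace
sketch in plain C.  The names gcd_u(), reduce_weights() and
clamp_weights() are hypothetical helpers made up for illustration; they
are not part of the patch or of any kernel API.  It assumes the
bandwidth values are well below UINT_MAX (so the +1 cannot wrap), and it
simply stops if any node reports a bandwidth of 0 or 1.

```c
#include <stddef.h>

/* Hypothetical helper: Euclid's algorithm; gcd_u(0, x) == x. */
static unsigned int gcd_u(unsigned int a, unsigned int b)
{
    while (b) {
        unsigned int t = a % b;

        a = b;
        b = t;
    }
    return a;
}

/*
 * Hypothetical helper: reduce the weights in place until the smallest
 * one is 1 (or 0).  Each pass either divides everything by the common
 * GCD or, if the GCD is 1, bumps every odd value to the next even
 * value so that the following pass can divide by at least 2.
 */
static void reduce_weights(unsigned int *w, size_t n)
{
    size_t i;

    if (n == 0)
        return;

    for (;;) {
        unsigned int min = w[0];
        unsigned int g = 0;

        for (i = 0; i < n; i++) {
            if (w[i] < min)
                min = w[i];
            g = gcd_u(g, w[i]);
        }

        if (min <= 1)
            break;

        if (g > 1) {
            for (i = 0; i < n; i++)
                w[i] /= g;
        } else {
            /* Assumes w[i] < UINT_MAX, so this cannot wrap. */
            for (i = 0; i < n; i++)
                w[i] += w[i] & 1;
        }
    }
}

/* Hypothetical helper: cap each reduced weight at 255 for a u8 table. */
static void clamp_weights(const unsigned int *w, unsigned char *iw, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        iw[i] = (w[i] > 255) ? 255 : (unsigned char)w[i];
}
```

With made-up inputs of { 100, 30 } this goes { 10, 3 } -> { 10, 4 } ->
{ 5, 2 } -> { 6, 2 } -> { 3, 1 }, so the rounding trades a little ratio
accuracy (3.33:1 becomes 3:1) for small weights.  An input such as
{ 1000, 1 } is already at min == 1, is not reduced at all, and only the
clamp to 255 applies, which is the > 255:1 breakage described above.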

~Gregory