On Tue, Feb 27, 2024 at 08:38:19AM +0800, Huang, Ying wrote:
> Gregory Price <gregory.price@xxxxxxxxxxxx> writes:
>
> > Where are the 100 nodes coming from?
>
> If you have a real large machine with more than 100 nodes, and some of
> them are CXL memory nodes, then it's possible that most nodes will have
> an interleave weight of "1" because the sum of all interleave weights
> is "100".  Then, even if you use only one socket, the interleave
> weights of DRAM and CXL memory could all be "1", leading to a useless
> default value.  So I suggest we don't cap the sum of the interleave
> weights.

I have to press this issue: is this an actual, practical concern?

It seems to me that in this type of scenario there are larger, more
complex NUMA topology issues that make the general, global weighted
mempolicy system entirely impractical.  This is a bit outside the scope
of this work, though.

> > So, a long-winded way of saying:
> >   - Could we use a larger default number?  Yes.
> >   - Does that actually help us?  Not really, we want smaller numbers.
>
> The larger number will be reduced after GCD.

I suppose another strategy is to calculate the interleave weights
unbounded from the raw bandwidth, and then continuously force
reductions (through some yet-undefined algorithm) until at least one
node reaches a weight of `1`.

This suffers from the opposite problem: what if the top node ends up
with a value greater than 255?  Do we just cap it at 255?  That seems
problematic in the opposite direction.  (Large weights are fairly
pointless anyway, as they are essentially the antithesis of
interleaving.)

> > I think I'll draft up an LSF/MM chat to see if we can garner more
> > input.
> >
> > If large-NUMA systems are a real issue, then yes, we need to
> > address it.
>
> Sounds good to me!

Working on it.

~Gregory
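
P.S. For concreteness, here is a rough user-space sketch (not kernel
code) of the two reduction strategies discussed above: dividing the
bandwidth-derived weights by their GCD, and scaling them down until the
smallest node reaches 1 with a cap at 255.  The node count, bandwidth
figures, and helper names are made up for illustration.

    /*
     * Illustrative sketch only; bandwidth values and the 255 cap are
     * assumptions matching the limits discussed in this thread.
     */
    #include <stdio.h>

    #define MAX_NODES   8
    #define WEIGHT_CAP  255     /* u8 weight limit discussed above */

    static unsigned int gcd(unsigned int a, unsigned int b)
    {
            while (b) {
                    unsigned int t = a % b;
                    a = b;
                    b = t;
            }
            return a;
    }

    /* Strategy 1: divide all bandwidth-derived weights by their GCD. */
    static void reduce_by_gcd(unsigned int *w, int n)
    {
            unsigned int g = w[0];
            int i;

            for (i = 1; i < n; i++)
                    g = gcd(g, w[i]);
            for (i = 0; i < n; i++)
                    w[i] /= g;
    }

    /* Strategy 2: scale until the smallest weight is 1, then cap at 255. */
    static void reduce_to_min_one(unsigned int *w, int n)
    {
            unsigned int min = w[0];
            int i;

            for (i = 1; i < n; i++)
                    if (w[i] < min)
                            min = w[i];
            for (i = 0; i < n; i++) {
                    w[i] = (w[i] + min / 2) / min;  /* round to nearest */
                    if (w[i] > WEIGHT_CAP)
                            w[i] = WEIGHT_CAP;      /* the "opposite problem" */
                    if (w[i] == 0)
                            w[i] = 1;
            }
    }

    int main(void)
    {
            /* Example raw bandwidths (GB/s): two DRAM nodes, two CXL nodes. */
            unsigned int bw[MAX_NODES] = { 300, 300, 60, 60 };
            unsigned int w1[MAX_NODES], w2[MAX_NODES];
            int n = 4, i;

            for (i = 0; i < n; i++)
                    w1[i] = w2[i] = bw[i];

            reduce_by_gcd(w1, n);
            reduce_to_min_one(w2, n);

            for (i = 0; i < n; i++)
                    printf("node %d: bw=%u gcd-reduced=%u min1-reduced=%u\n",
                           i, bw[i], w1[i], w2[i]);
            return 0;
    }

With the example bandwidths of 300/300/60/60, both strategies happen to
produce 5:5:1:1; they only diverge when the bandwidth ratios share no
common factor, which is where the GCD approach degenerates into very
large weights and the scale-to-1 approach runs into the 255 cap.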