On Wed, Oct 18, 2023 at 04:29:02PM +0800, Huang, Ying wrote:
> Gregory Price <gregory.price@xxxxxxxxxxxx> writes:
>
> > There are at least 5 proposals that i know of at the moment
> >
> > 1) mempolicy
> > 2) memory-tiers
> > 3) memory-block interleaving? (weighting among blocks inside a node)
> >    Maybe relevant if Dynamic Capacity devices arrive, but it seems
> >    like the wrong place to do this.
> > 4) multi-device nodes (e.g. cxl create-region ... mem0 mem1...)
> > 5) "just do it in hardware"
>
> It may be easier to start with the use case.  What is the practical use
> cases in your mind that can not be satisfied with simple per-memory-tier
> weight?  Can you compare the memory layout with different proposals?
>

Before I delve in, one clarifying question:  When you asked whether
weights should be part of node or memory-tiers, I took that to mean
whether it should be part of mempolicy or memory-tiers.

Were you suggesting that weights should actually be part of
drivers/base/node.c?

Because I had not considered that, and this seems reasonable, easy to
implement, and would not require tying mempolicy.c to memory-tiers.c.

Beyond this, I think there have been 3 imagined use cases (now,
including this).

a)
numactl --weighted-interleave=Node:weight,0:16,1:4,...

b)
echo weight > /sys/.../memory-tiers/memtier/access0/interleave_weight
numactl --interleave=0,1

c)
echo weight > /sys/bus/node/node0/access0/interleave_weight
numactl --interleave=0,1

d)
options b or c, but with --weighted-interleave=0,1 instead.
This requires libnuma changes to pick up, but it retains --interleave
as-is to avoid user confusion.

The downside of an approach like A (which was my original approach) was
that the weights cannot really change should a node be hotplugged.
Tasks would need to detect this and change the policy themselves.
That's not a good solution.

However, in both B and C's design, weights can be rebalanced in response
to any number of events.
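
To make the difference concrete, here is a rough userspace sketch in
Python.  It is purely illustrative and is not kernel code.  The names
node_weights, weighted_interleave() and place_pages() are made up for
this example, and I am assuming the simplest possible scheme, where each
round places weight[n] consecutive pages on node n.  The point is only
that the task's policy (the nodemask 0,1) stays fixed while the weights
live in one shared table that can be rewritten behind it.

```python
from itertools import islice

# Hypothetical system-wide weight table.  It stands in for the per-node
# interleave_weight attribute from option (c); it is not a real interface.
node_weights = {0: 16, 1: 4}


def weighted_interleave(weights):
    """Yield node ids forever, visiting node n weights[n] times per round."""
    if not weights or any(w <= 0 for w in weights.values()):
        raise ValueError("weights must be positive integers")
    while True:
        for node in sorted(weights):
            for _ in range(weights[node]):
                yield node


def place_pages(n_pages, nodemask):
    """Count pages per node, looking the weights up at allocation time."""
    active = {n: node_weights[n] for n in nodemask}
    counts = {n: 0 for n in nodemask}
    for node in islice(weighted_interleave(active), n_pages):
        counts[node] += 1
    return counts


# The task only ever asks for "interleave across nodes 0 and 1".
print(place_pages(100, [0, 1]))    # {0: 80, 1: 20} with weights 16:4

# Some event (e.g. a hotplug) triggers a rebalance of the shared table.
# The task's policy is untouched, yet later allocations follow 3:1.
node_weights.update({0: 3, 1: 1})
print(place_pages(100, [0, 1]))    # {0: 75, 1: 25}
```

With approach A the 16:4 ratio would be baked into each task's policy,
so the second result could only be reached by every task noticing the
event and calling set_mempolicy() again.
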
Ultimately B and C are equivalent, but the placement in nodes is cleaner
and more intuitive.  If memory-tiers wants to use/change this
information, there's nothing that prevents it.

Assuming this is your meaning, I agree and I will pivot to this.

~Gregory