Hi, Ravi,

Thanks for the patch!

Ravi Jonnalagadda <ravis.opensrc@xxxxxxxxxx> writes:

> From: Ravi Shankar <ravis.opensrc@xxxxxxxxxx>
>
> Hello,
>
> The current interleave policy operates by interleaving page requests
> among nodes defined in the memory policy. To accommodate the
> introduction of memory tiers for various memory types (e.g., DDR, CXL,
> HBM, PMEM, etc.), a mechanism is needed for interleaving page requests
> across these memory types or tiers.

Why do we need to interleave page allocation among memory tiers? I
think you need to make this explicit. I guess it is to increase the
maximum memory bandwidth available to workloads?

> This can be achieved by implementing an interleaving method that
> considers the tier weights.
> The tier weight will determine the proportion of nodes to select from
> those specified in the memory policy.
> A tier weight can be assigned to each memory type within the system.

What is the problem with the original interleaving? I think you need to
make that explicit too.

> Hasan Al Maruf had put forth a proposal for interleaving between two
> tiers, namely the top tier and the low tier. However, this patch was
> not adopted due to constraints on the number of available tiers.
>
> https://lore.kernel.org/linux-mm/YqD0%2FtzFwXvJ1gK6@xxxxxxxxxxx/T/
>
> New proposed changes:
>
> 1. Introduce a sysfs entry to allow setting the interleave weight for
> each memory tier.
> 2. Give each tier a default weight of 1, indicating a standard 1:1
> proportion.
> 3. Distribute the weight of a tier uniformly across all of its nodes.
> 4. Modify the existing interleaving algorithm to support multi-tier
> interleaving based on tier weights.
>
> This is in line with Huang, Ying's presentation at LPC 2022, slide 16 in
> https://lpc.events/event/16/contributions/1209/attachments/1042/1995/\
> Live%20In%20a%20World%20With%20Multiple%20Memory%20Types.pdf

Thanks for referring to the original work on this.

> We observed a significant increase (165%) in bandwidth utilization
> with the newly proposed multi-tier interleaving compared to the
> traditional 1:1 interleaving approach between DDR and CXL tier nodes,
> where 85% of the bandwidth is allocated to the DDR tier and 15% to the
> CXL tier, measured with the MLC -w2 option.

It appears that "mlc" isn't open source software. It would be better to
test with an open source tool. And, even better, to use a more
practical workload instead of a memory bandwidth/latency measurement
tool.

> Usage Example:
>
> 1. Set weights for the DDR (tier4) and CXL (tier22) tiers.
> echo 85 > /sys/devices/virtual/memory_tiering/memory_tier4/interleave_weight
> echo 15 > /sys/devices/virtual/memory_tiering/memory_tier22/interleave_weight
>
> 2. Interleave between DDR (tier4, node-0) and CXL (tier22, node-1) using numactl
> numactl -i0,1 mlc --loaded_latency W2
>
> Srinivasulu Thanneeru (2):
>   memory tier: Introduce sysfs for tier interleave weights.
>   mm: mempolicy: Interleave policy for tiered memory nodes
>
>  include/linux/memory-tiers.h |  27 ++++++++-
>  include/linux/sched.h        |   2 +
>  mm/memory-tiers.c            |  67 +++++++++++++++-------
>  mm/mempolicy.c               | 107 +++++++++++++++++++++++++++++++++--
>  4 files changed, 174 insertions(+), 29 deletions(-)

--
Best Regards,
Huang, Ying
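
For illustration, below is a minimal userspace sketch of the weighted
round-robin node selection that change (4) in the cover letter
describes, assuming two nodes whose tiers carry interleave weights 85
and 15. The function name and the weight table are hypothetical; this
is not the actual mm/mempolicy.c implementation from the patch set.

	/*
	 * Hypothetical userspace sketch of weighted interleave node
	 * selection.  Per-node weights are assumed to come from the
	 * tier interleave_weight values (e.g. 85 for a DDR node, 15
	 * for a CXL node); not the patch's kernel code.
	 */
	#include <stdio.h>

	#define NR_NODES 2

	/* Tier weight divided uniformly across the tier's nodes. */
	static const unsigned int node_weight[NR_NODES] = { 85, 15 };

	/*
	 * Weighted round-robin: allocation number 'n' maps to a node so
	 * that, over one full cycle of (85 + 15) allocations, node 0
	 * receives 85 pages and node 1 receives 15.
	 */
	static int weighted_interleave_node(unsigned long n)
	{
		unsigned long total = 0, pos;
		int node;

		for (node = 0; node < NR_NODES; node++)
			total += node_weight[node];

		pos = n % total;
		for (node = 0; node < NR_NODES; node++) {
			if (pos < node_weight[node])
				return node;
			pos -= node_weight[node];
		}
		return 0;
	}

	int main(void)
	{
		unsigned long n, count[NR_NODES] = { 0 };
		int node;

		/* Simulate 1000 page allocations, count per-node placement. */
		for (n = 0; n < 1000; n++)
			count[weighted_interleave_node(n)]++;

		for (node = 0; node < NR_NODES; node++)
			printf("node %d: %lu pages\n", node, count[node]);

		return 0;
	}

Compiled and run, the sketch places 850 of every 1000 simulated
allocations on node 0 and 150 on node 1, matching the 85:15 proportion
set through the interleave_weight sysfs files in the usage example.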