Hi, Ravi,

Thanks for the patch!

Ravi Jonnalagadda <ravis.opensrc@xxxxxxxxxx> writes:

> From: Ravi Shankar <ravis.opensrc@xxxxxxxxxx>
>
> Hello,
>
> The current interleave policy operates by interleaving page requests
> among nodes defined in the memory policy. To accommodate the
> introduction of memory tiers for various memory types (e.g., DDR, CXL,
> HBM, PMEM, etc.), a mechanism is needed for interleaving page requests
> across these memory types or tiers.

Why do we need to interleave page allocation among memory tiers? I
think you need to make this explicit. I guess it is to increase the
maximum memory bandwidth available to workloads?

> This can be achieved by implementing an interleaving method that
> considers the tier weights.
> The tier weight will determine the proportion of nodes to select from
> those specified in the memory policy.
> A tier weight can be assigned to each memory type within the system.

What is the problem with the original interleaving? I think you need to
make that explicit too.

> Hasan Al Maruf had put forth a proposal for interleaving between two
> tiers, namely the top tier and the low tier. However, this patch was
> not adopted due to constraints on the number of available tiers.
>
> https://lore.kernel.org/linux-mm/YqD0%2FtzFwXvJ1gK6@xxxxxxxxxxx/T/
>
> New proposed changes:
>
> 1. Introduce a sysfs entry to allow setting the interleave weight for
> each memory tier.
> 2. Give each tier a default weight of 1, indicating a standard 1:1
> proportion.
> 3. Distribute the weight of a tier uniformly across all of its nodes.
> 4. Modify the existing interleaving algorithm to support multi-tier
> interleaving based on tier weights.
>
> This is in line with Huang, Ying's presentation at LPC 2022, slide 16 in
> https://lpc.events/event/16/contributions/1209/attachments/1042/1995/\
> Live%20In%20a%20World%20With%20Multiple%20Memory%20Types.pdf

Thanks for referring to the original work on this.

> We observed a significant increase (165%) in bandwidth utilization
> with the newly proposed multi-tier interleaving compared to the
> traditional 1:1 interleaving approach between DDR and CXL tier nodes,
> where 85% of the bandwidth is allocated to the DDR tier and 15% to the
> CXL tier, measured with the MLC -w2 option.

It appears that "mlc" isn't open source software. It would be better to
test with an open source tool. And, even better, to use a more
practical workload instead of a memory bandwidth/latency measurement
tool.

> Usage Example:
>
> 1. Set weights for the DDR (tier4) and CXL (tier22) tiers.
> echo 85 > /sys/devices/virtual/memory_tiering/memory_tier4/interleave_weight
> echo 15 > /sys/devices/virtual/memory_tiering/memory_tier22/interleave_weight
>
> 2. Interleave between DDR (tier4, node-0) and CXL (tier22, node-1) using numactl
> numactl -i0,1 mlc --loaded_latency W2
>
> Srinivasulu Thanneeru (2):
>   memory tier: Introduce sysfs for tier interleave weights.
>   mm: mempolicy: Interleave policy for tiered memory nodes
>
>  include/linux/memory-tiers.h |  27 ++++++++-
>  include/linux/sched.h        |   2 +
>  mm/memory-tiers.c            |  67 +++++++++++++++-------
>  mm/mempolicy.c               | 107 +++++++++++++++++++++++++++++++++--
>  4 files changed, 174 insertions(+), 29 deletions(-)

--
Best Regards,
Huang, Ying
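
For illustration, below is a minimal userspace sketch of the weighted
round-robin node selection that change (4) in the cover letter
describes, assuming two nodes whose tiers carry interleave weights 85
and 15. The function name and the weight table are hypothetical; this
is not the actual mm/mempolicy.c implementation from the patch set.

	/*
	 * Hypothetical userspace sketch of weighted interleave node
	 * selection.  Per-node weights are assumed to come from the
	 * tier interleave_weight values (e.g. 85 for a DDR node, 15
	 * for a CXL node); not the patch's kernel code.
	 */
	#include <stdio.h>

	#define NR_NODES 2

	/* Tier weight divided uniformly across the tier's nodes. */
	static const unsigned int node_weight[NR_NODES] = { 85, 15 };

	/*
	 * Weighted round-robin: allocation number 'n' maps to a node so
	 * that, over one full cycle of (85 + 15) allocations, node 0
	 * receives 85 pages and node 1 receives 15.
	 */
	static int weighted_interleave_node(unsigned long n)
	{
		unsigned long total = 0, pos;
		int node;

		for (node = 0; node < NR_NODES; node++)
			total += node_weight[node];

		pos = n % total;
		for (node = 0; node < NR_NODES; node++) {
			if (pos < node_weight[node])
				return node;
			pos -= node_weight[node];
		}
		return 0;
	}

	int main(void)
	{
		unsigned long n, count[NR_NODES] = { 0 };
		int node;

		/* Simulate 1000 page allocations, count per-node placement. */
		for (n = 0; n < 1000; n++)
			count[weighted_interleave_node(n)]++;

		for (node = 0; node < NR_NODES; node++)
			printf("node %d: %lu pages\n", node, count[node]);

		return 0;
	}

Compiled and run, the sketch places 850 of every 1000 simulated
allocations on node 0 and 150 on node 1, matching the 85:15 proportion
set through the interleave_weight sysfs files in the usage example.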