Re: [LSF/MM/BPF TOPIC] The future of memory tiering

Jason Gunthorpe <jgg@xxxxxxxx> · Mon, 1 May 2023 10:16:49 -0300

On Wed, Apr 26, 2023 at 09:30:54PM -0700, David Rientjes wrote:
> Hi everybody,
> 
> As requested, sending along a last minute topic suggestion for 
> consideration for LSF/MM/BPF 2023 :)
> 
> For a sizable set of emerging technologies, memory tiering presents one of 
> the most formidable challenges and exciting opportunities for the MM 
> subsystem today.
> 
> "Memory tiering" can mean many different things based on the user: from 
> traditional everyday NUMA, to swap (to zswap), to NVDIMMs, to HBM, to 
> locally attached CXL memory, to memory borrowing over PCIe, to memory 
> pooling with disaggregation, and beyond.
> 
> Just as NUMA started out only being useful for the supercomputers, memory 
> tiering will likely evolve over the next five years to take on an 
> expanding set of use cases, and likely with rapidly increasing adoption 
> even beyond hyperscalers.
> 
> I think a discussion about memory tiering would be highly valuable.  A few 
> key questions that I think can drive this discussion:
> 
>  - What are the various form factors that must be supported as short-term 
>    goals as well as need to be supported 5+ years into the future?
> 
>  - What incremental changes need to be made on top of NUMA support to
>    fully support the wide range of use cases that will be coming?  (Is
>    memory tiering support built entirely upon NUMA?)
> 
>  - What is the minimum viable *default* support that the MM subsystem 
>    should provide for tiered configs?  What is the set of optimizations
>    that should be left to userspace or BPF to control?
> 
>  - What are the various page promotion techniques that we must plan for
>    beyond traditional NUMA balancing that will allow us to exploit
>    hardware innovation?
> 
> (And I'm sure there are more topics of discussion that others would 
> readily add.  It would be great to have additional ideas in replies.)
> 
> A key challenge in all of this is to make memory tiering support in the 
> upstream kernel compatible with the roadmaps of various CPU vendors.  A 
> key goal is to ensure the end user benefits from all of this rapid 
> innovation with generalized support that is well abstracted and allows for 
> extensibility.

I'm interested in this too. Memory pools with strong locality to
specific compute blocks are becoming an increasingly common feature of
supercomputer build-outs. It would be great to see a comprehensive
approach to this in the mm, not one that only solves the "external
slower DRAM" case.

Jason