Re: [RFC] Memory tiering kernel alignment

On 24/01/25 10:26AM, David Rientjes wrote:
> Hi everybody,
> 
> There is a lot of excitement around upcoming CXL type 3 memory expansion
> devices and their cost savings potential.  As the industry starts to
> adopt this technology, one of the key components in strategic planning is
> how the upstream Linux kernel will support various tiered configurations
> to meet various user needs.  I think it goes without saying that this is
> quite interesting to cloud providers as well as other hyperscalers :)
> 
> I think this discussion would benefit from a collaborative approach
> between various stakeholders and interested parties.  This is partly
> because several different use cases need different support models,
> but also because there is great incentive toward moving "with"
> upstream Linux for this support rather than having multiple parties
> bringing up their own stacks only to find that they are diverging from
> upstream rather than converging with it.
> 
> I'm interested to learn if there is interest in forming a "Linux Memory
> Tiering Work Group" to share ideas, discuss multi-faceted approaches, and
> keep track of work items?
> 
> Some recent discussions have proven that there is widespread interest in
> some very foundational topics for this technology such as:
> 
>  - Decoupling CPU balancing from memory balancing (or obsoleting CPU
>    balancing entirely)
> 
>    + John Hubbard notes this would be useful for GPUs:
> 
>       a) GPUs have their own processors that are invisible to the kernel's
>          NUMA "which tasks are active on which NUMA nodes" calculations,
>          and
> 
>       b) Similar to where CXL is generally going, we have already built
>          fully memory-coherent hardware, which includes memory-only NUMA
>          nodes.
> 
>  - In-kernel hot memory abstraction, informed by hardware hinting drivers
>    (incl some architectures like Power10), usable as a NUMA Balancing
>    backend for promotion and other areas of the kernel like transparent
>    hugepage utilization
> 
>  - NUMA and memory tiering enlightenment for accelerators, such as for
>    optimal use of GPU memory, extremely important for a cloud provider
>    (hint hint :)
> 
>  - Asynchronous memory promotion independent of task_numa_fault() while
>    considering the cost of page migration (due to identifying cold memory)
> 
> It looks like there is already some interest in such a working group that
> would have a biweekly discussion of shared interests with the goal of
> accelerating design, development, testing, and division of work:
> 
> Alistair Popple
> Aneesh Kumar K V
> Brian Morris
> Christoph Lameter
> Dan Williams
> Gregory Price
> Grimm, Jon
> Huang, Ying
> Johannes Weiner
> John Hubbard
> Zi Yan
> 
> Specifically for the in-kernel hot memory abstraction topic, Google and
> Meta recently published an OCP base specification "Hyperscale CXL Tiered
> Memory Expander Specification" available at
> https://drive.google.com/file/d/1fFfU7dFmCyl6V9-9qiakdWaDr9d38ewZ/view?usp=drive_link
> that would be great to discuss.
> 
> There is also on-going work in the CXL Consortium to standardize some of
> the abstractions for CXL 3.1.
> 
> If folks are interested in this topic and your name doesn't appear above
> (I already got you :), please:
> 
>  - reply-all to this email to express interest and expand upon the list
>    of topics above to represent additional areas of interest that should
>    be included, *or*
> 
>  - email me privately to express interest to make sure you are included
> 
> Perhaps I'm overly optimistic, but one thing that would be absolutely
> *amazing* would be if we all have a very clear and understandable vision
> for how Linux will support the wide variety of use cases, even before
> that work is fully implemented (or even designed), by LSF/MM/BPF 2024
> time in May.
> 
> Thanks!
> 

Please add me to the CXL interested parties list.

John Groves (jgroves@xxxxxxxxxx / John@xxxxxxxxxxxxxx)






