Re: Slow-tier Page Promotion discussion recap and open questions

On Wed, Dec 18, 2024 at 07:56:19PM -0500, Gregory Price wrote:
> On Tue, Dec 17, 2024 at 08:19:56PM -0800, David Rientjes wrote:
> > ----->o-----
> > Raghu noted the current promotion destination is node 0 by default.  Wei
> > noted we could get some page owner information to determine things like
> > mempolicies or compute the distance between nodes and, if multiple nodes
> > have the same distance, choose one of them just as we do for demotions.
> > 
> > Gregory Price noted some downsides to using mempolicies for this based on
> > per-task, per-vma, and cross socket policies, so using the kernel's
> > memory tiering policies is probably the best way to go about it.
> > 
> 
> Slightly elaborating here:
> - In an async context, associating a page with a specific task is not
>   presently possible (that I know of). The most we know is the last
>   accessing CPU - maybe - in the page/folio struct.  Right now this
>   is disabled in favor of a timestamp when tiering is enabled.
> 
>   A process with 2 tasks which have access to the page may not run
>   on the same socket, so we run the risk of migrating to a bad target.
>   Best effort here would suggest either socket is fine - since they're
>   both "fast nodes" - but this requires that we record the last 
>   accessing CPU for a page at identification time.
> 

This can be solved with a two-step migration: first, promote the page
from CXL to a NUMA node; then rely on NUMA balancing to move the page
to the right NUMA node. NUMA hint faults can still be enabled for
pages allocated from NUMA nodes, but not for CXL.
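
A toy model of the two steps, in case it helps the discussion. This is
plain Python, not kernel code. The node IDs, the distance table, and the
function names (promotion_target, balance) are all made up for
illustration, and real NUMA balancing filters accesses before it
migrates anything, which this sketch does not attempt to model.

```python
# Toy userspace model of two-step promotion; not kernel code.
# Node IDs and distances are hypothetical.

# SLIT-style distance table: DISTANCE[a][b], smaller means closer.
DISTANCE = {
    0: {0: 10, 1: 21, 2: 14},
    1: {0: 21, 1: 10, 2: 24},
    2: {0: 14, 1: 24, 2: 10},  # node 2 plays the CXL (slow tier) node
}
FAST_NODES = [0, 1]


def promotion_target(src_node):
    """Step 1: pick the closest fast node; lowest node ID breaks ties."""
    return min(FAST_NODES, key=lambda n: (DISTANCE[src_node][n], n))


def balance(page_node, faulting_node):
    """Step 2: model a NUMA hint fault on a fast-tier page.

    Pages still on the slow tier get no hint faults in this model,
    so they stay where they are.
    """
    if page_node not in FAST_NODES:
        return page_node
    return faulting_node


page = 2                       # page starts on the CXL node
page = promotion_target(page)  # step 1: lands on node 0, the closest
page = balance(page, 1)        # step 2: a task on node 1 touches it
print(page)
```

The point of the split is that step 1 needs no knowledge of which task
owns the page: any nearby fast node will do, and the usual hint-fault
path corrects the placement afterwards.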

Best,
Karim
Edinburgh University




