Re: Slow-tier Page Promotion discussion recap and open questions

Nadav Amit <nadav.amit@xxxxxxxxx> · Wed, 18 Dec 2024 17:21:59 +0200

> On 18 Dec 2024, at 6:19, David Rientjes <rientjes@xxxxxxxxxx> wrote:
> 
> Hi everybody,
> 
> We had a very interactive discussion last week led by RaghavendraKT on
> slow-tier page promotion intended for memory tiering platforms, thank
> you!  Thanks as well to everybody who attended and provided great
> questions, suggestions, and feedback.
> 
> The RFC patch series "mm: slowtier page promotion based on PTE A bit"[1]
> is a proposal to allow for asynchronous page promotion based on memory
> accesses as an alternative to NUMA Balancing based promotions.  There was
> widespread interest in this topic and the discussion surfaced multiple
> use cases and requirements, very focused on CXL use cases.
> 

Just sharing my 2 cents.

IIUC, the suggested approach has two benefits:

1. Fewer/no page-faults (as A-bit is used to detect usage)
2. Batching

While (2) seems like a win that might be added on top of AutoNUMA, (1)
is more delicate. As indicated in the patch-set, the "exact destination"
is lost. At the same time, the last time I checked, setting the A-bit
wasn't free and cost something like 550 cycles (others saw similar
results [1]).

So considering that an empty page-fault costs ~1050 cycles (a 2014 number
Linus measured [2]), the question is how big of a win it really is...
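
To put rough numbers on that, here is a back-of-envelope sketch in Python.
It only compares the two figures quoted above (~550 cycles for an A-bit
set, ~1050 cycles for an empty page-fault) for a single detected access,
and ignores every other cost on either path. The function and constant
names are made up for illustration, and the inputs are old ballpark
numbers, not fresh measurements.

```python
# Back-of-envelope comparison; both cycle counts are the rough figures
# quoted in this mail, not new measurements.
A_BIT_SET_CYCLES = 550      # approx. cost of the hardware setting the A-bit
EMPTY_FAULT_CYCLES = 1050   # approx. cost of an empty page-fault (2014 number)


def per_access_saving(a_bit=A_BIT_SET_CYCLES, fault=EMPTY_FAULT_CYCLES):
    """Return (cycles saved, fraction of the fault cost saved) per detected access."""
    saved = fault - a_bit
    return saved, saved / fault


saved, fraction = per_access_saving()
print(f"saved ~{saved} cycles per detected access ({fraction:.0%} of a fault)")
# prints: saved ~500 cycles per detected access (48% of a fault)
```

So under these assumptions the detection step gets roughly 2x cheaper at
best, not an order of magnitude.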

[1] https://lore.kernel.org/all/20160620000606.GB3194@blaptop/
[2] Google+ post RIP