Re: RFC: Memory Tiering Kernel Interfaces

Dave Hansen <dave.hansen@xxxxxxxxx> · Tue, 3 May 2022 10:47:20 -0700

On 5/3/22 10:14, Alistair Popple wrote:
> I would certainly be interested in figuring out how HW could provide some sort
> of heatmap to identify which pages are hot and which processing unit is using
> them. Currently for these systems users have to manually assign memory policy to
> get any reasonable performance, both to disable NUMA balancing and make sure
> memory is allocated on the right node.

Autonuma-induced page faults are a total non-starter for lots of
workloads, even ignoring GPUs.  Basically anyone who is latency
sensitive stays far, far away from autonuma.

As for improving on page faults for data collection...

*Can* hardware provide this information?  Definitely.

Have hardware vendors been motivated enough to add hardware to do this?
Nope, not yet.

Do you know anyone that works for any hardware companies? ;)

Seriously, though.  Folks at Intel _are_ thinking about this problem.
I'm hoping we'll have hardware some day to lend a hand.  The more
hardware vendors that do this, the more likely it is that we'll have
good kernel code to consume data from the hardware.