Hi Balbir,
On 18-Mar-25 10:58 AM, Balbir Singh wrote:
On 3/6/25 16:45, Bharata B Rao wrote:
Hi,
This is an attempt towards having a single subsystem that accumulates
hot page information from lower memory tiers and does hot page
promotion.
At the heart of this subsystem is a kernel daemon named kpromoted that
does the following:
1. Exposes an API that other subsystems which detect/generate memory
access information can use to inform the daemon about memory
accesses from lower memory tiers.
2. Maintains the list of hot pages and attempts to promote them to
toptiers.
Currently I have added AMD IBS driver as one source that provides
page access information as an example. This driver feeds info to
kpromoted in this RFC patchset. More sources were discussed in a
similar context here at [1].
Is hot page promotion mandated or good to have?
If you look at the current hot page promotion (NUMAB=2) logic, IIUC an
accessed lower tier page is directly promoted to toptier if enough space
exists in the toptier node. In such cases, it doesn't even bother about
the hot threshold (measure of how recently it was accessed) or migration
rate limiting. This tells me that in a tiered memory setup, having an
accessed page in toptier is preferable.
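To make that ordering concrete, here is a minimal userspace sketch of the decision as I understand it. The struct, its fields and the function name are hypothetical and only illustrate the order of the checks; this is not the kernel code.

```c
#include <stdbool.h>

/* Hypothetical inputs for one accessed lower tier page (not kernel types). */
struct promo_ctx {
	bool toptier_has_free_space;	/* enough free space in the toptier node */
	unsigned int access_latency_ms;	/* how long ago the page was last accessed */
	unsigned int hot_threshold_ms;	/* pages accessed within this window are hot */
	bool rate_limit_exceeded;	/* promotion rate limit already hit */
};

/* Returns true if the page would be promoted under the assumed ordering. */
static bool sketch_should_promote(const struct promo_ctx *c)
{
	/* Free space in toptier: promote without consulting threshold or rate limit. */
	if (c->toptier_has_free_space)
		return true;

	/* Otherwise only recently accessed (hot) pages qualify. */
	if (c->access_latency_ms >= c->hot_threshold_ms)
		return false;

	/* Hot page: promote unless the rate limit has been exceeded. */
	return !c->rate_limit_exceeded;
}
```

With free space in toptier, even a page that fails the hot threshold and arrives while the rate limit is exceeded is still promoted in this sketch.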
Memory tiers today are a function of latency and bandwidth, specifically in
mt_perf_to_adistance():
adist ~ k * R(B)/R(L), where R(x) is the relative performance of the
memory w.r.t. DRAM. Do we want hot pages in the top tier all the time?
Are we optimizing for bandwidth or latency?
Since the memory tiering code converts BW and latency numbers into an
opaque metric (adistance), based on which the node gets placed at an
appropriate position in the tiering hierarchy, I wonder whether it is
still possible to say that we are optimizing for bandwidth or latency
separately.
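As a rough illustration of why the two are hard to separate, here is a small userspace sketch that folds latency and bandwidth into one number, with the distance growing with latency and shrinking with bandwidth relative to DRAM. The struct, the function name and the base constant are assumptions made for this example, not the kernel's definitions.

```c
#include <stdint.h>

/* Hypothetical performance record: latency in ns, bandwidth in MB/s. */
struct tier_perf {
	uint64_t latency;
	uint64_t bandwidth;
};

/* Assumed base distance assigned to DRAM; illustrative value only. */
#define SKETCH_ADIST_DRAM 576

/*
 * Distance is proportional to the node's latency and inversely
 * proportional to its bandwidth, both taken relative to DRAM.
 */
static int sketch_perf_to_adist(const struct tier_perf *dram,
				const struct tier_perf *node)
{
	return (int)(SKETCH_ADIST_DRAM *
		     node->latency / dram->latency *
		     dram->bandwidth / node->bandwidth);
}
```

Taking DRAM as { 100, 100000 }, a node with twice the latency and the same bandwidth ({ 200, 100000 }) and a node with the same latency and half the bandwidth ({ 100, 50000 }) both come out at 1152 in this sketch. Once the numbers are folded into adistance, the two cases are indistinguishable.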
This is just an early attempt to check what it takes to maintain
a single source of page hotness info and also separate hot page
detection mechanisms from the promotion mechanism. There are too
many open ends right now and I have listed a few of them below.
<snip>
This is just an early RFC posted now to ignite some discussion
in the context of LSFMM [2].
I look forward to any summary of the discussions.
Sure. Thanks,
Bharata.