On 5/13/2024 7:19 AM, Davidlohr Bueso wrote:
On Thu, 09 May 2024, Huang, Ying wrote:
With the default configuration, the current NUMA balancing based promotion
solution will try to promote almost any faulting page. To select hot
pages to promote and to control thrashing between NUMA nodes, the promote
rate limit needs to be configured. For example, with
echo 200 > /proc/sys/kernel/numa_balancing_promote_rate_limit_MBps
up to 200MB of hot pages will be selected and promoted every second. Can
you try it?
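
For anyone reproducing this, a minimal sketch of applying the limit and
reading it back. The function name and the SYSCTL_DIR override are
hypothetical conveniences (the override only exists so the function can be
dry-run against a scratch directory); writing the real file needs root, and
the tunable is only expected to matter when NUMA balancing runs in its
memory tiering mode.

```shell
#!/bin/sh
# Sketch only: set the promotion rate limit and print the value now in effect.
# SYSCTL_DIR is a hypothetical override; by default the real sysctl directory is used.
SYSCTL_DIR="${SYSCTL_DIR:-/proc/sys/kernel}"

set_promote_limit() {
    limit_mbps="$1"
    knob="$SYSCTL_DIR/numa_balancing_promote_rate_limit_MBps"

    # Bail out early if the running kernel does not expose the tunable.
    [ -w "$knob" ] || { echo "cannot write $knob" >&2; return 1; }

    echo "$limit_mbps" > "$knob" || return 1

    # Read the value back so the caller can confirm what was applied.
    cat "$knob"
}
```

Calling `set_promote_limit 200` as root would be equivalent to the echo
above, with a check that the value stuck.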
Yes, I've played with this tunable and, just like the LRU approach, it
shows nice micro wins (fewer promotions/demotions) but little in the way of
actual benchmark improvements at a higher level: merely noise level or
very subtle wins. In fact, the actual data from that series for this
parameter was a ~2% pmbench win with the rate limiting, but a 69% promotion
rate decrease.
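
A rough sketch of how such a promotion-rate comparison could be taken from
/proc/vmstat, assuming the kernel exports the pgpromote_success counter
(pages successfully promoted). Whether the numbers quoted above were
collected this way is not stated; the helper names and the VMSTAT override
are hypothetical.

```shell
#!/bin/sh
# Sketch only: read a vmstat counter and compare two measurements.
# VMSTAT is a hypothetical override so a saved snapshot can be used instead.
VMSTAT="${VMSTAT:-/proc/vmstat}"

# Print the value of one counter from a vmstat-style file (0 if absent).
vmstat_get() {
    awk -v key="$1" '$1 == key { v = $2 } END { print v + 0 }' "${2:-$VMSTAT}"
}

# Percent change from a baseline count to a new count, rounded to an integer.
pct_change() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        if (a == 0) { print "n/a"; exit }
        printf "%.0f\n", (b - a) * 100 / a
    }'
}
```

Sampling `vmstat_get pgpromote_success` before and after each benchmark run
gives the pages promoted per run; `pct_change` on the two per-run deltas
(baseline first, rate-limited second) gives the relative change.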
And this is really my point: how much effort do we want to put into
optimizing software mechanisms for hot page detection? Are there other
benchmarks we should be using?
Yes, some representative benchmarks to evaluate the effectiveness of hot
page promotion would be useful.
Recently there was a discussion about the effectiveness of hot page
detection in the context of a micro-benchmark. More details here:
https://lore.kernel.org/linux-mm/929b22ca-bb51-4307-855f-9b4ae0a102e3@xxxxxxx/T/#m04eb5d9dfb30133156d4dcb33b09b89a4e9299ea
And perhaps doing the promotion asynchronously, without incurring
the NUMA balancing overhead, and comparing the cost of migration before
promoting would yield better numbers, but that might also be easy to
get wrong when weighed against the relative hotness of the page.
Regards,
Bharata.