Re: [PATCH v5] mm/mempolicy: Weighted Interleave Auto-tuning

On Fri, 7 Feb 2025 18:20:09 -0800 Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:

> On Fri,  7 Feb 2025 12:13:35 -0800 Joshua Hahn <joshua.hahnjy@xxxxxxxxx> wrote:
> 
> > This patch introduces an auto-configuration mode for the interleave
> > weights that aims to balance the two goals of setting node weights to be
> > proportional to their bandwidths and keeping the weight values low.
> > In order to perform the weight re-scaling, we use an internal
> > "weightiness" value (fixed to 32) that defines interleave aggression.
> 
> Question please.  How does one determine whether a particular
> configuration is working well?  To determine whether
> manual-configuration-A is better than manual-configuration-B is better
> than auto-configuration?
> 
> Leading to... how do we know that this patch makes the kernel better?

Hello Andrew,

Thank you for your interest in this patch!

To answer your 1st question: I think that users can experiment with the
specific workloads they expect to be running. In particular, since the
weights that provide the best results are workload-specific, it might
make sense to compare results across the variety of workloads that the
users expect, and to see which settings lead to the least amount of
throttling.

With that said, this patch introduces defaults that will hopefully help
those who are either unable to set weights themselves or uninterested in
doing so. For users who have already been using weighted interleave and
know what specific weights they should use, the auto settings might not
have as much impact as they would for someone who is unsure what the
best weights are (and would rather defer the decision-making to the
system).

As for measuring the accuracy of the default weights generated:
The auto mode works by taking nodes' bandwidth data and trying to use
small numbers (between 1 and 255) to approximate those bandwidth values.
For instance, bandwidths of [19000, 4000, 7000] might be converted to
weights like 4:1:2, since of course we don't want to start allocating
from the second node only after 19000 pages have already been allocated
from the first.

But simultaneously... 4:1:2 is not the same ratio as 19000:4000:7000.
So there is a tradeoff between getting accurate weight values and
keeping them small enough so as not to have unbalanced distributions.
This is where we chose the value of 32 to be the magic "weightiness"
value.
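
To make the tradeoff concrete, here is a rough Python sketch of one
possible reduction. It is not the code in the patch: the helper names
(reduce_weights, max_share_error) and the rounding rule are assumptions
made purely for illustration. It scales each node's share of the total
bandwidth so the weights sum to roughly the weightiness value, then
measures how far the resulting weight shares drift from the bandwidth
shares.

```python
from functools import reduce
from math import gcd


def reduce_weights(bandwidths, weightiness=32):
    """Scale bandwidths so the weights sum to roughly `weightiness`.

    Hypothetical helper: the rounding rule below is an assumption,
    not necessarily what the patch implements. Expects a non-empty
    list of positive integer bandwidths.
    """
    total = sum(bandwidths)
    # Round-to-nearest integer share of `weightiness`, never below 1.
    weights = [max(1, (weightiness * bw + total // 2) // total)
               for bw in bandwidths]
    # 1:2 is the same ratio as 16:32 with shorter runs per node,
    # so divide out the common factor.
    common = reduce(gcd, weights)
    return [w // common for w in weights]


def max_share_error(bandwidths, weights):
    """Largest gap between a node's bandwidth share and weight share."""
    total_bw, total_w = sum(bandwidths), sum(weights)
    return max(abs(bw / total_bw - w / total_w)
               for bw, w in zip(bandwidths, weights))


bw = [19000, 4000, 7000]
for weightiness in (7, 32):
    weights = reduce_weights(bw, weightiness)
    print(weightiness, weights, round(max_share_error(bw, weights), 4))
```

With these inputs the sketch produces 4:1:2 at a weightiness of 7 and
20:4:7 at 32. The larger value tracks the bandwidth shares more closely,
at the cost of longer runs of pages on each node.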

Gregory and I spent quite some time modeling this behavior, trying
different reduction algorithms and weightiness values to see what could
give us the most accurate approximation of the bandwidth ratios while
using reasonably small numbers, and ended up with 32. (Earlier versions
of this patch also exposed the weightiness parameter as a sysfs knob,
but it was removed for simplicity's sake.)

We've gotten some nice results (under reasonable conditions) after
running exhaustive tests for a wide array of bandwidth configurations,
which is why we were confident in selecting 32 as the default value.

As for the 2nd question and how this patch makes the kernel better :-)
Like I mentioned above, this patch might not have a large impact on
those already using weighted interleave to see performance gains and
know what weights work the best. However, we believe there are users
out there who (1) have nodes with varying bandwidths (CXL),
(2) have workloads that are bandwidth-bound, and (3) would like to
take advantage of weighted interleave but do not have the capacity
or are not willing to manually change the weights. For these folks,
having defaults that make sense (as opposed to the previous defaults
in weighted interleave, which would make it functionally the same
as unweighted interleave) can provide more options and performance
gains to those who wish to opt in.
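
To illustrate the point about the previous defaults, here is a small
Python sketch of a simplified placement model. It is an assumption for
illustration only, not the kernel's allocator: each node receives
`weight` consecutive pages per round, and the function names are
hypothetical. When every weight is equal (1 here), the placement is
identical to plain round-robin interleave.

```python
from itertools import islice


def weighted_interleave(weights, npages):
    """Return the node picked for each of `npages` allocations.

    Simplified model: node i receives weights[i] consecutive pages
    per round. Assumes every weight is at least 1.
    """
    def rounds():
        while True:
            for node, weight in enumerate(weights):
                for _ in range(weight):
                    yield node
    return list(islice(rounds(), npages))


def plain_interleave(nnodes, npages):
    """Unweighted round-robin over `nnodes` nodes."""
    return [page % nnodes for page in range(npages)]


print(weighted_interleave([1, 1, 1], 6))  # same placement as below
print(plain_interleave(3, 6))
print(weighted_interleave([4, 1, 2], 7))  # one full round of 4:1:2
```

In this model, only non-uniform weights such as 4:1:2 actually shift
more pages toward the higher-bandwidth node.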

I apologize for the long explanation, but I hope that this answers
your question. Please let me know if there is anything else I can do!

Thank you again for your interest. I hope you have a great day!
Joshua



