Re: Page allocator bottleneck

On 18/09/2017 10:44 AM, Aaron Lu wrote:
On Mon, Sep 18, 2017 at 03:34:47PM +0800, Aaron Lu wrote:
On Sun, Sep 17, 2017 at 07:16:15PM +0300, Tariq Toukan wrote:

It's nice to have the option to play with the parameter dynamically.
But maybe we should also think about changing the default fraction guaranteed
to the PCP, so that admins of networking servers who are unaware of the knob would also benefit.
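
(For concreteness, here is a minimal userspace sketch of poking such a knob at
runtime, assuming the parameter meant here is /proc/sys/vm/percpu_pagelist_fraction;
the value written is purely illustrative, not a recommendation.)

/*
 * Sketch only: raise the PCP share at runtime, assuming the tunable is
 * /proc/sys/vm/percpu_pagelist_fraction.  Writing N lets each zone's
 * per-cpu lists grow to roughly 1/N of the zone; writing 0 falls back
 * to the zone_batchsize() default.
 */
#include <stdio.h>

int main(void)
{
	FILE *f = fopen("/proc/sys/vm/percpu_pagelist_fraction", "w");

	if (!f) {
		perror("percpu_pagelist_fraction");
		return 1;
	}
	fprintf(f, "%d\n", 64);	/* illustrative fraction only */
	fclose(f);
	return 0;
}

The same write can of course be done with sysctl or a plain echo from a shell;
the point is just that the setting is runtime-tunable.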

I collected some performance data with will-it-scale/page_fault1 process
mode on different machines with different pcp->batch sizes, starting
from the default 31 (calculated by zone_batchsize(); 31 is the standard
value for any zone that has more than 1/2 MiB memory), then incremented
in steps of 31 up to 527. PCP's upper limit is 6*batch.
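
For readers who don't want to dig through mm/page_alloc.c, here is a small
userspace sketch of the arithmetic behind that default, based on my reading
of zone_batchsize(); the exact kernel code may differ in details:

/*
 * Sketch of how the default pcp->batch is derived (my reading of
 * zone_batchsize(); treat the constants as assumptions).
 */
#include <stdio.h>

#define PAGE_SIZE 4096UL

/* Round down to the nearest power of two. */
static unsigned long rounddown_pow_of_two(unsigned long n)
{
	unsigned long p = 1;

	while (p * 2 <= n)
		p *= 2;
	return p;
}

static int default_batch(unsigned long managed_pages)
{
	unsigned long batch;

	/* Aim at ~1/1024 of the zone, but never more than 512 KiB worth. */
	batch = managed_pages / 1024;
	if (batch * PAGE_SIZE > 512 * 1024)
		batch = (512 * 1024) / PAGE_SIZE;
	batch /= 4;
	if (batch < 1)
		batch = 1;

	/* Clamp to a 2^n - 1 value. */
	return rounddown_pow_of_two(batch + batch / 2) - 1;
}

int main(void)
{
	/* Example: a 1 GiB zone. */
	unsigned long zone_pages = 1024UL * 1024 * 1024 / PAGE_SIZE;
	int batch = default_batch(zone_pages);

	printf("pcp->batch = %d, pcp->high = %d\n", batch, 6 * batch);
	return 0;
}

For the 1 GiB example zone this prints pcp->batch = 31 and pcp->high = 186,
i.e. the default starting point and 6*batch upper limit mentioned above.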

An image is plotted and attached: batch_full.png ("full" here means the
number of processes started equals the CPU count).

To be clear: the X-axis is the batch size (31, 62, 93, ..., 527), and the
Y-axis is per_process_ops as reported by will-it-scale; higher is better.


From the image:
- EX machines all see throughput increase with larger batch sizes, peaking
   at around batch_size=310 and then falling;
- Among the EP machines, Haswell-EP and Broadwell-EP also see throughput
   increase with larger batch sizes, peaking at batch_size=279 and then
   falling; batch_size=310 also delivers a pretty good result. Skylake-EP is
   quite different in that it doesn't see any obvious throughput increase
   after batch_size=93; the trend keeps rising, but only slightly, finally
   peaking at batch_size=403 before falling. Ivybridge-EP behaves much like
   the desktop machines.
- Desktop machines do not see any obvious change with increased batch_size.

So the default batch size (31) doesn't deliver a good enough result; we
probably should change the default value.

Thanks Aaron for sharing your experiment results.
That's a good analysis of the effect of the batch value.
I agree with your conclusion.

From a networking perspective, we should reconsider the defaults so that we can keep up with ever-increasing NIC line rates, not only for pcp->batch but also for pcp->high.

Regards,
Tariq
