Re: [PATCH 2/3] mm, meminit: Recalculate pcpu batch and high limits after init completes

On Fri, 18 Oct, at 11:56:05AM, Mel Gorman wrote:
> Deferred memory initialisation updates zone->managed_pages during
> the initialisation phase but before that finishes, the per-cpu page
> allocator (pcpu) calculates the number of pages allocated/freed in
> batches as well as the maximum number of pages allowed on a per-cpu list.
> As zone->managed_pages is not up to date yet, the pcpu initialisation
> calculates inappropriately low batch and high values.
> 
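
For readers following along, here is a rough userspace sketch of how a batch and high mark can fall out of the managed page count. It is an assumption modelled on the zone_batchsize() heuristic in mm/page_alloc.c as recalled from kernels around v5.4; it is not taken from this patch. The sketch_* names and the constants (1024, 1 MiB cap, factor of 6) are illustrative only.

```c
#include <assert.h>

/* Largest power of two that is <= n (n must be >= 1). */
static unsigned long sketch_rounddown_pow2(unsigned long n)
{
	unsigned long p = 1;

	assert(n >= 1);
	while (p <= n / 2)
		p *= 2;
	return p;
}

/*
 * Hypothetical model of the batch heuristic: roughly 1/1024 of the
 * zone, capped at 1 MiB worth of pages, quartered, then rounded to
 * one less than a power of two.
 */
static unsigned long sketch_batch(unsigned long managed_pages,
				  unsigned long page_size)
{
	unsigned long batch = managed_pages / 1024;

	assert(page_size > 0);
	if (batch * page_size > 1024UL * 1024UL)
		batch = (1024UL * 1024UL) / page_size;
	batch /= 4;
	if (batch < 1)
		batch = 1;
	return sketch_rounddown_pow2(batch + batch / 2) - 1;
}

/* Assumed relation: the per-cpu list limit is a fixed multiple of batch. */
static unsigned long sketch_high(unsigned long batch)
{
	return 6 * batch;
}
```

Under these assumptions and 4 KiB pages, a zone that reports only 32768 managed pages gets a batch of 7 and a high mark of 42, while one reporting 16M managed pages gets 63 and 378. A stale, too-small managed_pages therefore means many more trips to zone->lock for the same amount of work.
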
> This increases zone lock contention quite severely in some cases with the
> degree of severity depending on how many CPUs share a local zone and the
> size of the zone. A private report indicated that kernel build times were
> excessive with extremely high system CPU usage. A perf profile indicated
> that a large chunk of time was lost on zone->lock contention.
> 
> This patch recalculates the pcpu batch and high values after deferred
> initialisation completes on each node. It was tested on a 2-socket AMD
> EPYC 2 machine using a kernel compilation workload -- allmodconfig and
> all available CPUs.
> 
> mmtests configuration: config-workload-kernbench-max
> Configuration was modified to build on a fresh XFS partition.
> 
> kernbench
>                                 5.4.0-rc3              5.4.0-rc3
>                                   vanilla         resetpcpu-v1r1
> Amean     user-256    13249.50 (   0.00%)    15928.40 * -20.22%*
> Amean     syst-256    14760.30 (   0.00%)     4551.77 *  69.16%*
> Amean     elsp-256      162.42 (   0.00%)      118.46 *  27.06%*
> Stddev    user-256       42.97 (   0.00%)       50.83 ( -18.30%)
> Stddev    syst-256      336.87 (   0.00%)       33.70 (  90.00%)
> Stddev    elsp-256        2.46 (   0.00%)        0.81 (  67.01%)
> 
>                    5.4.0-rc3       5.4.0-rc3
>                      vanilla  resetpcpu-v1r1
> Duration User       39766.24        47802.92
> Duration System     44298.10        13671.93
> Duration Elapsed      519.11          387.65
> 
> The patch reduces system CPU usage by 69.16% and total build time by
> 27.06%. The variance of system CPU usage is also much reduced.
> 
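
The percentages quoted above can be checked from the Amean rows with a small helper. This is a minimal sketch, assuming the percentage column is the relative reduction against the vanilla baseline; the function name sketch_gain_pct is hypothetical and not part of mmtests.

```c
#include <assert.h>

/*
 * Relative change against the baseline, in percent.
 * Positive means the patched value is lower (an improvement for
 * time-based metrics); negative means it got worse.
 */
static double sketch_gain_pct(double baseline, double patched)
{
	assert(baseline != 0.0);
	return (baseline - patched) / baseline * 100.0;
}
```

Feeding in the syst-256 means (14760.30, 4551.77) and the elsp-256 means (162.42, 118.46) gives values close to the reported 69.16% and 27.06%. The last digit may differ slightly because the table inputs are already rounded. The same helper applied to user-256 gives a negative number, matching the -20.22% shown for user CPU time.
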
> Cc: stable@xxxxxxxxxxxxxxx # v4.15+
> Signed-off-by: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
> ---
>  mm/page_alloc.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)

Tested-by: Matt Fleming <matt@xxxxxxxxxxxxxxxxxxx>



