Re: [PATCH] mm: page_alloc: Default node-ordering on 64-bit NUMA, zone-ordering on 32-bit v2

On Tue, Sep 09, 2014 at 03:46:30PM +0100, Mel Gorman wrote:
> Changelog since v1
> o Default to zone-ordering on 32-bit and remove heuristics
> o Expand changelog
> 
> Zones are allocated by the page allocator in either node or zone order.
> Node ordering is preferred in terms of locality and is applied automatically
> in one of three cases.
> 
>   1. If a node has only low memory
> 
>   2. If DMA/DMA32 is a high percentage of memory
> 
>   3. If low memory on a single node is greater than 70% of the node size
> 
> Otherwise zone ordering is used to preserve low memory for devices that
> require it. Unfortunately, a consequence is that workloads on a machine
> with balanced NUMA nodes see different performance characteristics
> depending on which node they happen to start on.
> 
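
For readers following along, here is a minimal userspace sketch of the two
orderings described above. It is not the kernel's code: the types and the
function names (node_info, zone_ref, build_node_order, build_zone_order) are
hypothetical, and node distance is reduced to "the caller passes nodes
nearest first".

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel's zone types, lowest zone first. */
enum zone_type { ZONE_DMA, ZONE_DMA32, ZONE_NORMAL, NR_ZONE_TYPES };

/* Hypothetical per-node description: which zones actually have memory. */
struct node_info {
	int id;
	int populated[NR_ZONE_TYPES];
};

/* One fallback-list entry: which zone on which node to try next. */
struct zone_ref {
	int node;
	enum zone_type zone;
};

/*
 * Node order: try every zone of the nearest node (highest zone first)
 * before moving on to the next node. nodes[] is sorted nearest first.
 * Returns the number of entries written to out[].
 */
static size_t build_node_order(const struct node_info *nodes, size_t nr_nodes,
			       struct zone_ref *out)
{
	size_t n = 0;

	assert(nodes && out);
	for (size_t i = 0; i < nr_nodes; i++) {
		for (int z = NR_ZONE_TYPES - 1; z >= 0; z--) {
			if (!nodes[i].populated[z])
				continue;
			out[n].node = nodes[i].id;
			out[n].zone = (enum zone_type)z;
			n++;
		}
	}
	return n;
}

/*
 * Zone order: try the highest zone on every node before falling back to
 * a lower zone anywhere. This keeps DMA/DMA32 for last, at the cost of
 * going to a remote node while local low memory is still free.
 */
static size_t build_zone_order(const struct node_info *nodes, size_t nr_nodes,
			       struct zone_ref *out)
{
	size_t n = 0;

	assert(nodes && out);
	for (int z = NR_ZONE_TYPES - 1; z >= 0; z--) {
		for (size_t i = 0; i < nr_nodes; i++) {
			if (!nodes[i].populated[z])
				continue;
			out[n].node = nodes[i].id;
			out[n].zone = (enum zone_type)z;
			n++;
		}
	}
	return n;
}
```

With node 0 holding DMA, DMA32 and Normal and node 1 holding only Normal,
node order gives N0/Normal, N0/DMA32, N0/DMA, N1/Normal, while zone order
gives N0/Normal, N1/Normal, N0/DMA32, N0/DMA. A task on node 0 therefore
goes remote earlier under zone order, which is the asymmetry the changelog
describes.
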
> The point of zone ordering is to protect lower zones for devices that
> require DMA/DMA32 memory. When NUMA was first introduced, this was critical
> as 32-bit NUMA machines existed and exhausting low memory triggered OOMs
> easily as so many allocations required low memory. On 64-bit machines the
> primary concern is devices that are 32-bit only which is less severe than
> the low memory exhaustion problem on 32-bit NUMA. It seems there are really
> few devices that depend on it.
> 
> AGP -- I assume this is getting more rare but even then I think the allocations
> 	happen early in boot time where lowmem pressure is less of a problem
> 
> DRM -- If the device is 32-bit only then there may be lowmem pressure. I didn't
> 	evaluate these in detail but it looks like some of these are mobile
> 	graphics cards. Not many NUMA laptops out there. DRM folk should know
> 	better though.
> 
> Some TV cards -- Much demand for 32-bit capable TV cards on NUMA machines?
> 
> B43 wireless card -- again not really a NUMA thing.
> 
> I cannot find a good reason to incur a performance penalty on all 64-bit NUMA
> machines in case someone throws a brain damaged TV or graphics card in there.
> This patch defaults to node-ordering on 64-bit NUMA machines. I was tempted
> to make it default everywhere but I understand that some embedded arches may
> be using 32-bit NUMA where I cannot predict the consequences.
> 
> The performance impact depends on the workload and the characteristics of the
> machine and the machine I tested on had a large Normal zone on node 0 so the
> impact is within the noise for the majority of tests. The allocation stats
> show more allocation requests were from DMA32 and local node. Running SpecJBB
> with multiple JVMs and automatic NUMA balancing disabled, the results were:
> 
> specjbb
>                      3.17.0-rc2            3.17.0-rc2
>                         vanilla        nodeorder-v1r1
> Min    1      29534.00 (  0.00%)     30020.00 (  1.65%)
> Min    10    115717.00 (  0.00%)    134038.00 ( 15.83%)
> Min    19    109718.00 (  0.00%)    114186.00 (  4.07%)
> Min    28    104459.00 (  0.00%)    103639.00 ( -0.78%)
> Min    37     98245.00 (  0.00%)    103756.00 (  5.61%)
> Min    46     97198.00 (  0.00%)     96197.00 ( -1.03%)
> Mean   1      30953.25 (  0.00%)     31917.75 (  3.12%)
> Mean   10    124432.50 (  0.00%)    140904.00 ( 13.24%)
> Mean   19    116033.50 (  0.00%)    119294.75 (  2.81%)
> Mean   28    108365.25 (  0.00%)    106879.50 ( -1.37%)
> Mean   37    102984.75 (  0.00%)    106924.25 (  3.83%)
> Mean   46    100783.25 (  0.00%)    105368.50 (  4.55%)
> Stddev 1       1260.38 (  0.00%)      1109.66 ( 11.96%)
> Stddev 10      7434.03 (  0.00%)      5171.91 ( 30.43%)
> Stddev 19      8453.84 (  0.00%)      5309.59 ( 37.19%)
> Stddev 28      4184.55 (  0.00%)      2906.63 ( 30.54%)
> Stddev 37      5409.49 (  0.00%)      3192.12 ( 40.99%)
> Stddev 46      4521.95 (  0.00%)      7392.52 (-63.48%)
> Max    1      32738.00 (  0.00%)     32719.00 ( -0.06%)
> Max    10    136039.00 (  0.00%)    148614.00 (  9.24%)
> Max    19    130566.00 (  0.00%)    127418.00 ( -2.41%)
> Max    28    115404.00 (  0.00%)    111254.00 ( -3.60%)
> Max    37    112118.00 (  0.00%)    111732.00 ( -0.34%)
> Max    46    108541.00 (  0.00%)    116849.00 (  7.65%)
> TPut   1     123813.00 (  0.00%)    127671.00 (  3.12%)
> TPut   10    497730.00 (  0.00%)    563616.00 ( 13.24%)
> TPut   19    464134.00 (  0.00%)    477179.00 (  2.81%)
> TPut   28    433461.00 (  0.00%)    427518.00 ( -1.37%)
> TPut   37    411939.00 (  0.00%)    427697.00 (  3.83%)
> TPut   46    403133.00 (  0.00%)    421474.00 (  4.55%)
> 
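
As a reading aid for the table above: the bracketed percentages are
consistent with a plain relative change against the vanilla column, with the
sign flipped for Stddev so that a positive value always means "better". The
helper names below are hypothetical; this is only a sketch of that
arithmetic, not the reporting tool's code.

```c
#include <assert.h>

/* Relative change in percent for a higher-is-better metric (Min, Mean, TPut). */
static double pct_gain(double base, double patched)
{
	assert(base != 0.0);
	return (patched - base) / base * 100.0;
}

/* Relative change in percent for a lower-is-better metric (Stddev). */
static double pct_reduction(double base, double patched)
{
	assert(base != 0.0);
	return (base - patched) / base * 100.0;
}
```

For example, pct_gain(115717.0, 134038.0) reproduces the 15.83% shown for
"Min 10", and pct_reduction(4521.95, 7392.52) reproduces the -63.48% shown
for "Stddev 46", i.e. variance got worse at 46 warehouses.
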
>                             3.17.0-rc2  3.17.0-rc2
>                                vanilla nodeorder-v1r1
> DMA allocs                           0           0
> DMA32 allocs                        57     1491992
> Normal allocs                 32543566    30026383
> Movable allocs                       0           0
> Direct pages scanned                 0           0
> Kswapd pages scanned                 0           0
> Kswapd pages reclaimed               0           0
> Direct pages reclaimed               0           0
> Kswapd efficiency                 100%        100%
> Kswapd velocity                  0.000       0.000
> Direct efficiency                 100%        100%
> Direct velocity                  0.000       0.000
> Percentage direct scans             0%          0%
> Zone normal velocity             0.000       0.000
> Zone dma32 velocity              0.000       0.000
> Zone dma velocity                0.000       0.000
> THP fault alloc                  55164       52987
> THP collapse alloc                 139         147
> THP splits                          26          21
> NUMA alloc hit                 4169066     4250692
> NUMA alloc miss                      0           0
> 
> Note that there were more DMA32 allocations with the patch applied.  In this
> particular case there was no difference in numa_hit and numa_miss. The
> expectation is that DMA32 was being used at the low watermark instead of
> falling into the slow path. kswapd was not woken, but it is not woken for
> THP allocations anyway.
> 
> On 32-bit, this patch defaults to zone-ordering as low memory depletion
> can be a serious problem on 32-bit large memory machines. If the default
> ordering were node then processes on node 0 would deplete the Normal zone
> due to normal activity.  The problem is worse if CONFIG_HIGHPTE is not
> set. If combined with large amounts of dirty/writeback pages in Normal
> zone then there is also a high risk of OOM. The heuristics are removed
> as it's not clear they were ever important on 32-bit. They were only
> relevant for setting node-ordering on 64-bit.
> 
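
To make the before/after concrete, here is a rough sketch of the decision as
the changelog words it. It is a paraphrase under assumptions, not the code in
mm/page_alloc.c: the struct and function names are hypothetical, and the 50%
cutoff used for "a high percentage" in case 2 is an assumed value, since the
changelog gives no number there.

```c
#include <assert.h>
#include <stddef.h>

enum zonelist_order { ORDER_NODE, ORDER_ZONE };

/* Hypothetical per-node summary, in pages. */
struct node_mem {
	unsigned long long low_pages;   /* DMA + DMA32 */
	unsigned long long total_pages; /* all zones on the node */
};

/* Old behaviour: node order only if one of the three listed cases holds. */
static enum zonelist_order old_default_order(const struct node_mem *nodes,
					     size_t nr_nodes)
{
	unsigned long long low = 0, total = 0;

	assert(nodes);
	for (size_t i = 0; i < nr_nodes; i++) {
		/* Case 1: a node has only low memory. */
		if (nodes[i].total_pages &&
		    nodes[i].low_pages == nodes[i].total_pages)
			return ORDER_NODE;
		low += nodes[i].low_pages;
		total += nodes[i].total_pages;
	}

	/* Case 2: DMA/DMA32 is a high percentage of memory (ASSUMED: > 50%). */
	if (low * 2 > total)
		return ORDER_NODE;

	/* Case 3: low memory on a single node exceeds 70% of that node. */
	for (size_t i = 0; i < nr_nodes; i++)
		if (nodes[i].low_pages * 100 > nodes[i].total_pages * 70)
			return ORDER_NODE;

	return ORDER_ZONE;
}

/* New behaviour per this patch: decided by word size alone. */
static enum zonelist_order new_default_order(int bits)
{
	return bits == 64 ? ORDER_NODE : ORDER_ZONE;
}
```

On a balanced two-node box where node 0 has a quarter of its memory in
DMA/DMA32, the old rules in this sketch pick zone order, while the new
default picks node order on 64-bit regardless of layout.
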
> Signed-off-by: Mel Gorman <mgorman@xxxxxxx>

Acked-by: Johannes Weiner <hannes@xxxxxxxxxxx>
