On Wed, Feb 16, 2011 at 6:50 PM, Mel Gorman <mel@xxxxxxxxx> wrote:
> should_continue_reclaim() for reclaim/compaction allows scanning to continue
> even if pages are not being reclaimed until the full list is scanned. In
> terms of allocation success, this makes sense but potentially it introduces
> unwanted latency for high-order allocations such as transparent hugepages
> and network jumbo frames that would prefer to fail the allocation attempt
> and fall back to order-0 pages.  Worse, there is a potential that the full
> LRU scan will clear all the young bits, distort page aging information and
> potentially push pages into swap that would have otherwise remained resident.
>
> This patch will stop reclaim/compaction if no pages were reclaimed in the
> last SWAP_CLUSTER_MAX pages that were considered. For allocations such as
> hugetlbfs that use GFP_REPEAT and have fewer fallback options, the full LRU
> list may still be scanned.
>
> To test this, a tool was developed based on ftrace that tracked the latency of
> high-order allocations while transparent hugepage support was enabled and three
> benchmarks were run. The "fix-infinite" figures are 2.6.38-rc4 with Johannes's
> patch "vmscan: fix zone shrinking exit when scan work is done" applied.
>
> STREAM Highorder Allocation Latency Statistics
>             fix-infinite     break-early
> 1 :: Count         10298           10229
> 1 :: Min          0.4560          0.4640
> 1 :: Mean         1.0589          1.0183
> 1 :: Max         14.5990         11.7510
> 1 :: Stddev       0.5208          0.4719
> 2 :: Count             2               1
> 2 :: Min          1.8610          3.7240
> 2 :: Mean         3.4325          3.7240
> 2 :: Max          5.0040          3.7240
> 2 :: Stddev       1.5715          0.0000
> 9 :: Count        111696          111694
> 9 :: Min          0.5230          0.4110
> 9 :: Mean        10.5831         10.5718
> 9 :: Max         38.4480         43.2900
> 9 :: Stddev       1.1147          1.1325
>
> Mean time for order-1 allocations is reduced. order-2 looks increased
> but with so few allocations, it's not particularly significant. THP mean
> allocation latency is also reduced. That said, allocation time varies so
> significantly that the reductions are within noise.
>
> Max allocation time is reduced by a significant amount for low-order
> allocations but increased for THP allocations which presumably are now
> breaking before reclaim has done enough work.
>
> SysBench Highorder Allocation Latency Statistics
>             fix-infinite     break-early
> 1 :: Count         15745           15677
> 1 :: Min          0.4250          0.4550
> 1 :: Mean         1.1023          1.0810
> 1 :: Max         14.4590         10.8220
> 1 :: Stddev       0.5117          0.5100
> 2 :: Count             1               1
> 2 :: Min          3.0040          2.1530
> 2 :: Mean         3.0040          2.1530
> 2 :: Max          3.0040          2.1530
> 2 :: Stddev       0.0000          0.0000
> 9 :: Count          2017            1931
> 9 :: Min          0.4980          0.7480
> 9 :: Mean        10.4717         10.3840
> 9 :: Max         24.9460         26.2500
> 9 :: Stddev       1.1726          1.1966
>
> Again, mean time for order-1 allocations is reduced while order-2 allocations
> are too few to draw conclusions from. The mean time for THP allocations is
> also slightly reduced albeit the reductions are within variance.
>
> Once again, our maximum allocation time is significantly reduced for
> low-order allocations and slightly increased for THP allocations.
>
> Anon stream mmap reference Highorder Allocation Latency Statistics
> 1 :: Count              1376            1790
> 1 :: Min              0.4940          0.5010
> 1 :: Mean             1.0289          0.9732
> 1 :: Max              6.2670          4.2540
> 1 :: Stddev           0.4142          0.2785
> 2 :: Count                 1               -
> 2 :: Min              1.9060               -
> 2 :: Mean             1.9060               -
> 2 :: Max              1.9060               -
> 2 :: Stddev           0.0000               -
> 9 :: Count             11266           11257
> 9 :: Min              0.4990          0.4940
> 9 :: Mean         27250.4669      24256.1919
> 9 :: Max       11439211.0000    6008885.0000
> 9 :: Stddev      226427.4624     186298.1430
>
> This benchmark creates one thread per CPU which references an amount of
> anonymous memory 1.5 times the size of physical RAM. This pounds swap quite
> heavily and is intended to exercise THP a bit.
>
> Mean allocation time for order-1 is reduced as before. It's also reduced
> for THP allocations but the variations here are pretty massive due to swap.
> As before, maximum allocation times are significantly reduced.
>
> Overall, the patch reduces the mean and maximum allocation latencies for
> the smaller high-order allocations. This was with Slab configured so it
> would be expected to be more significant with Slub which uses these size
> allocations more aggressively.
>
> The mean allocation times for THP allocations are also slightly reduced.
> The maximum latency was slightly increased as predicted by the comments due
> to reclaim/compaction breaking early. However, workloads care more about the
> latency of lower-order allocations than THP so it's an acceptable trade-off.
> Please consider merging for 2.6.38.
>
> Signed-off-by: Mel Gorman <mel@xxxxxxxxx>
> ---
>  mm/vmscan.c |   32 ++++++++++++++++++++++----------
>  1 files changed, 22 insertions(+), 10 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 148c6e6..591b907 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1841,16 +1841,28 @@ static inline bool should_continue_reclaim(struct zone *zone,
>         if (!(sc->reclaim_mode & RECLAIM_MODE_COMPACTION))
>                 return false;
>
> -       /*
> -        * If we failed to reclaim and have scanned the full list, stop.
> -        * NOTE: Checking just nr_reclaimed would exit reclaim/compaction far
> -        *       faster but obviously would be less likely to succeed
> -        *       allocation. If this is desirable, use GFP_REPEAT to decide

Typo. __GFP_REPEAT

Otherwise, looks good to me.
Reviewed-by: Minchan Kim <minchan.kim@xxxxxxxxx>

--
Kind regards,
Minchan Kim
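
For readers following the thread, the early-exit rule described in the changelog can be sketched as a small userspace model. This is not the code from the patch, whose hunk is cut off above. The function name keep_reclaiming() and its parameters are hypothetical, and treating "nothing scanned in the last batch" as "the full list has been scanned" is an assumption made for illustration. The real should_continue_reclaim() applies further checks afterwards that are not modelled here.

```c
#include <stdbool.h>

/*
 * Hypothetical userspace model of the break-early decision.
 *
 * gfp_repeat   - the caller asked for __GFP_REPEAT behaviour
 * nr_reclaimed - pages reclaimed from the last batch considered
 * nr_scanned   - pages scanned in the last batch considered
 *
 * Returns true if reclaim/compaction should keep going.
 */
static bool keep_reclaiming(bool gfp_repeat,
                            unsigned long nr_reclaimed,
                            unsigned long nr_scanned)
{
    if (gfp_repeat) {
        /*
         * Callers with few fallback options: give up only when
         * nothing was reclaimed and nothing was left to scan
         * (assumed here to mean the full list was covered).
         */
        return nr_reclaimed != 0 || nr_scanned != 0;
    }

    /*
     * Callers that can fail cheaply (THP, jumbo frames): give up
     * as soon as a batch reclaims nothing, even though more of
     * the list could still be scanned.
     */
    return nr_reclaimed != 0;
}
```

With this model, a batch that scans 32 pages and reclaims none ends reclaim for an ordinary high-order request but not for a __GFP_REPEAT one. That matches the trade-off in the changelog: lower latency for callers that can fall back, at some cost in allocation success.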