This patchset is based on mmots as of Feb 9th, 2016. The baseline is important as there are a number of kswapd-related fixes in that tree and a comparison against v4.10-rc7 would be almost meaningless as a result. The series is unusual in that the first patch fixes one problem and introduces a host of other issues and is incomplete. It was not developed by me but it appears to have gotten lost so I picked it up and added to the changelog. Patch 2 makes a minor modification that is worth considering on its own but leaves the kernel in a state where it behaves badly. It's not until patch 3 that there is an improvement against baseline. This was mostly motivated by examining Chris Mason's "simoop" benchmark which puts the VM under similar pressure to HADOOP. It has been reported that the benchmark has regressed severely during the last number of releases. While I cannot reproduce all the same problems Chris experienced due to hardware limitations, there was a number of problems on a 2-socket machine with a single disk. 4.10.0-rc7 4.10.0-rc7 mmots-20170209 keepawake-v1r25 Amean p50-Read 22325202.49 ( 0.00%) 22092755.48 ( 1.04%) Amean p95-Read 26102988.80 ( 0.00%) 26101849.04 ( 0.00%) Amean p99-Read 30935176.53 ( 0.00%) 29746220.52 ( 3.84%) Amean p50-Write 976.44 ( 0.00%) 952.73 ( 2.43%) Amean p95-Write 15471.29 ( 0.00%) 3140.27 ( 79.70%) Amean p99-Write 35108.62 ( 0.00%) 8843.73 ( 74.81%) Amean p50-Allocation 76382.61 ( 0.00%) 76349.22 ( 0.04%) Amean p95-Allocation 127777.39 ( 0.00%) 108630.26 ( 14.98%) Amean p99-Allocation 187937.39 ( 0.00%) 139094.26 ( 25.99%) These are latencies. Read/write are threads reading fixed-size random blocks from a simulated database. The allocation latency is mmaping and faulting regions of memory. The p50, 95 and p99 reports the worst latencies for 50% of the samples, 95% and 99% respectively. For example, the report indicates that while the test was running 99% of writes completed 74.81% faster. It's worth noting that on a UMA machine that no difference in performance with simoop was observed so milage will vary. On UMA, there was a notable difference in the "stutter" benchmark which measures the latency of mmap while large files are being copied. This has been used as a proxy measure for desktop jitter while large amounts of IO were taking place 4.10.0-rc7 4.10.0-rc7 mmots-20170209 keepawake-v1 Min mmap 6.3847 ( 0.00%) 5.9785 ( 6.36%) 1st-qrtle mmap 7.6310 ( 0.00%) 7.4086 ( 2.91%) 2nd-qrtle mmap 9.9959 ( 0.00%) 7.7052 ( 22.92%) 3rd-qrtle mmap 14.8180 ( 0.00%) 8.5895 ( 42.03%) Max-90% mmap 15.8397 ( 0.00%) 13.6974 ( 13.52%) Max-93% mmap 16.4268 ( 0.00%) 14.3175 ( 12.84%) Max-95% mmap 18.3295 ( 0.00%) 16.9233 ( 7.67%) Max-99% mmap 24.2042 ( 0.00%) 20.6182 ( 14.82%) Max mmap 255.0688 ( 0.00%) 265.5818 ( -4.12%) Mean mmap 11.2192 ( 0.00%) 9.1811 ( 18.17%) Latency is measured in milliseconds and indicates that 99% of mmap operations complete 14.82% faster and are 18.17% faster on average with these patches applied. mm/memory_hotplug.c | 2 +- mm/vmscan.c | 128 +++++++++++++++++++++++++++++----------------------- 2 files changed, 72 insertions(+), 58 deletions(-) -- 2.11.0 Mel Gorman (2): mm, vmscan: Only clear pgdat congested/dirty/writeback state when balanced mm, vmscan: Prevent kswapd sleeping prematurely due to mismatched classzone_idx Shantanu Goel (1): mm, vmscan: fix zone balance check in prepare_kswapd_sleep mm/memory_hotplug.c | 2 +- mm/vmscan.c | 128 +++++++++++++++++++++++++++++----------------------- 2 files changed, 72 insertions(+), 58 deletions(-) -- 2.11.0 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>