On Thu 20-08-15 14:42:40, Mel Gorman wrote:
> On Thu, Aug 20, 2015 at 03:18:43PM +0200, Michal Hocko wrote:
> > On Wed 12-08-15 11:45:26, Mel Gorman wrote:
> > [...]
> > > 4-node machine stutter
> > >                            4.2.0-rc1            4.2.0-rc1
> > >                              vanilla          nozlc-v1r20
> > > Min        mmap    53.9902 (  0.00%)    49.3629 (  8.57%)
> > > 1st-qrtle  mmap    54.6776 (  0.00%)    54.1201 (  1.02%)
> > > 2nd-qrtle  mmap    54.9242 (  0.00%)    54.5961 (  0.60%)
> > > 3rd-qrtle  mmap    55.1817 (  0.00%)    54.9338 (  0.45%)
> > > Max-90%    mmap    55.3952 (  0.00%)    55.3929 (  0.00%)
> > > Max-93%    mmap    55.4766 (  0.00%)    57.5712 ( -3.78%)
> > > Max-95%    mmap    55.5522 (  0.00%)    57.8376 ( -4.11%)
> > > Max-99%    mmap    55.7938 (  0.00%)    63.6180 (-14.02%)
> > > Max        mmap  6344.0292 (  0.00%)    67.2477 ( 98.94%)
> > > Mean       mmap    57.3732 (  0.00%)    54.5680 (  4.89%)
> >
> > Do you have data for other loads? Because the reclaim counters look
> > quite discouraging to be honest.
> >
>
> None of the other workloads showed changes that were worth reporting.

OK, that is a good sign. I would agree that an extreme and artificial
load shouldn't be considered as a blocker.

> > >                            4.1.0       4.1.0
> > >                          vanilla  nozlc-v1r4
> > > Swap Ins                     838         502
> > > Swap Outs                1149395     2622895
> >
> > Twice as many swapouts is a lot.
> >
> > > DMA32 allocs            17839113    15863747
> > > Normal allocs          129045707   137847920
> > > Direct pages scanned     4070089    29046893
> >
> > 7x more scans by direct reclaim also sounds bad.
> >
>
> With this benchmark, the results for stutter will be highly variable as
> it's hammering the system. The intent of the test was to measure stalls at
> a time when desktop interactivity went to hell during IO and could stall
> for several minutes. Due to its nature, there is intense reclaim *and*
> compaction activity going on and there is no point drawing conclusions
> from the reclaim stats that are inherently good or bad.
>
> There will be differences in direct reclaim figures because instead of
> looping in the page allocator waiting for zlc to clear, it'll enter direct
> reclaim.

OK, I haven't considered this. kswapd might be stuck for quite some time
but all of them being stuck shouldn't be that likely. But still, this is
not a desirable behavior.

> In effect, the zlc causes processes to busy loop while kswapd
> does the work. If it turns out that this is the correct behaviour then
> we should do that explicitly, not rely on the broken zlc behaviour for
> the same reason we no longer rely on sprinkling congestion_wait() all
> over the place.

Fair point. I do agree that this should be done outside of
get_page_from_freelist. I am still surprised by the considerable
increase of swapouts but that should be handled separately if we see
that in real-world loads.

That being said
Acked-by: Michal Hocko <mhocko@xxxxxxxx>
--
Michal Hocko
SUSE Labs