On Wed, Feb 19, 2020 at 02:42:31PM -0800, Sultan Alsawaf wrote:
> On Wed, Feb 19, 2020 at 09:45:13PM +0000, Mel Gorman wrote:
> > This could be watermark boosting run wild again. Can you test with
> > sysctl vm.watermark_boost_factor=0 or the following patch? (preferably
> > both to compare and contrast).
>
> I can test that, but something to note is that I've been doing equal
> testing with this on 4.4, which exhibits the same behavior, and that
> kernel doesn't have watermark boosting in it, as far as I can tell.
>
> I don't think what we're addressing here is a "bug", but rather something
> fundamental about how we've been thinking about kswapd lifetime. The
> argument here is that it's not coherent to be letting kswapd run as it
> does, and instead gating it on outstanding allocation requests provides
> much more reasonable behavior, given real workloads and use patterns.
>
> Does that make sense and seem reasonable?
>

I'm not entirely convinced. The reason the high watermark exists is to have
kswapd work long enough to make progress without a process having to enter
direct reclaim. The most straightforward example would be a streaming reader
of a large file. It keeps pushing the zone towards the low watermark, and
kswapd has to stay ahead of the reader; if we cut kswapd off too quickly,
the min watermark is hit and stalls occur. While kswapd could stop at the
min watermark, that leaves a very short window for kswapd to make enough
progress before the min watermark is hit.

At minimum, any change in this area would need to include the /proc/vmstat
counters for allocstall and pg*direct* to show that direct reclaim stalls
are no worse. I'm not a fan of the patch in question: while kswapd can be
woken between the low and min watermarks without anything stalling, we
really do expect kswapd to make progress, and to continue making progress,
to avoid future stalls.
The changelog had no information on the before/after impact of the patch,
and this is an area where intuition can disagree with real behaviour.

-- 
Mel Gorman
SUSE Labs