On 2/15/2018 6:10 PM, Mel Gorman wrote:
> On Wed, Jan 24, 2018 at 09:25:37PM +0530, Vinayak Menon wrote:
>> Hi,
>>
>> It is observed that watermark_scale_factor, when used to reduce thundering herds
>> in direct reclaim, reduces the direct reclaims but results in unnecessary reclaim
>> due to kswapd running for long after being woken up. The tests are done with 4 GB
>> of RAM, and the tests are a multibuild and another which opens a set of apps
>> sequentially on Android, repeating the sequence N times. The tests are done on a
>> 4.9 kernel.
>>
>> The issue seems to be because of watermark_scale_factor creating a larger gap
>> between the low and high watermarks. The following results compare a
>> watermark_scale_factor of 120 against watermark_scale_factor 120 with a reduced
>> gap between the low and high watermarks. The patch used to reduce the gap is
>> given below. The min-low gap is untouched. It can be seen that with the reduced
>> low-high gap, the direct reclaims are almost the same as base, but with 45% less
>> pgpgin. The reduced low-high gap improves the latency by around 11% in the
>> sequential app test due to less IO and kswapd activity.
>>
>>                       wsf-120-default  wsf-120-reduced-low-high-gap
>> workingset_activate   15120206         8319182
>> pgpgin                269795482        147928581
>> allocstall            1406             1498
>> pgsteal_kswapd        68676960         38105142
>> slabs_scanned         94181738         49085755
>>
>> This is the diff of wsf-120-reduced-low-high-gap for comments. The patch considers
>> the low-high gap as a fraction of the min-low gap, with the fraction a function of
>> managed pages, increasing non-linearly. The multiplier 4 was chosen as a reasonable
>> value which does not alter the low-high gap much from the base for large machines.
>>
> This needs a proper changelog, signed-offs and a comment on the reasoning
> behind the new min value for the gap between low and high and how it
> was derived. It appears the equation was designed such that the gap, as
> a percentage of the zone size, would shrink as the zone size
> increases, but I'm not 100% certain that was the intent. That should be
> explained, along with why just using "tmp >> 2" would have problems.
>
> It would also need review/testing by Johannes to ensure that there is no
> reintroduction of the problems that watermark_scale_factor was designed
> to solve.

Sorry for the delayed response. I will send a patch with the details. The equation was
designed so that the low-high gap is small for smaller RAM sizes and tends towards the
min-low gap as the RAM size increases. This was done considering that it should not have
a bad effect on the 140G configuration which Johannes had taken as an example when
watermark_scale_factor was introduced, and assuming that the thrashing seen due to the
low-high gap would be visible only on low-RAM devices.
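As a rough illustration of the intended scaling (a userspace sketch of my own, not the
kernel code: it assumes 4K pages, uses floating-point sqrt() instead of int_sqrt(), and
ignores the clamp's min_wmark_pages(zone) >> 2 lower bound), the fraction of the min-low
gap that is kept as the low-high gap is sqrt(4 * managed_pages) / 10000:

/*
 * Rough userspace sketch of how the proposed low-high gap scales with
 * zone size:
 *
 *   gap = clamp(min_low_gap * int_sqrt(4 * managed_pages) / 10000,
 *               min_wmark_pages(zone) >> 2, min_low_gap)
 *
 * Assumptions: 4K pages, floating-point sqrt() instead of int_sqrt(),
 * and the lower clamp bound is ignored here.
 */
#include <stdio.h>
#include <stdint.h>
#include <math.h>

int main(void)
{
	const uint64_t ram_gb[] = { 4, 65, 140 };

	for (int i = 0; i < 3; i++) {
		uint64_t managed = (ram_gb[i] << 30) >> 12;	/* pages */
		double frac = sqrt(4.0 * (double)managed) / 10000.0;

		if (frac > 1.0)		/* upper clamp: gap never exceeds the min-low gap */
			frac = 1.0;

		printf("%4lluG RAM: low-high gap ~= %.0f%% of the min-low gap\n",
		       (unsigned long long)ram_gb[i], frac * 100.0);
	}
	return 0;
}

This prints roughly 20% for 4G, 83% for 65G and 100% (clamped) for 140G, which lines up
roughly with the gap numbers quoted below.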
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 3a11a50..749d1eb 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -6898,7 +6898,11 @@ static void __setup_per_zone_wmarks(void)
>>  				    watermark_scale_factor, 10000));
>>
>>  		zone->watermark[WMARK_LOW]  = min_wmark_pages(zone) + tmp;
>> -		zone->watermark[WMARK_HIGH] = min_wmark_pages(zone) + tmp * 2;
>> +
>> +		tmp = clamp_t(u64, mult_frac(tmp, int_sqrt(4 * zone->managed_pages),
>> +			      10000), min_wmark_pages(zone) >> 2, tmp);
>> +
>> +		zone->watermark[WMARK_HIGH] = low_wmark_pages(zone) + tmp;
>>
>>  		spin_unlock_irqrestore(&zone->lock, flags);
>>  	}
>>
>> With the patch:
>>
>> With watermark_scale_factor as the default 10, the low-high gap is:
>>   unchanged for 140G at 143M,
>>   reduced from 65M to 53M for 65G,
>>   reduced from 4M to 1M for 4GB.
>>
>> With watermark_scale_factor 120, the low-high gap is:
>>   unchanged for 140G,
>>   reduced from 786M to 644M for 65G,
>>   reduced from 49M to 10M for 4GB.
>>
> This information should also be in the changelog.

Sure.

Thanks,
Vinayak