On Mon, Jul 21, 2014 at 05:54:11PM +0200, Michal Hocko wrote:
> From: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
>
> commit b738d764652dc5aab1c8939f637112981fce9e0e upstream
>
> shrink_inactive_list() used to wait 0.1s to avoid congestion when all
> the pages that were isolated from the inactive list were dirty but not
> under active writeback. That makes no real sense, and apparently causes
> major interactivity issues under some loads since 3.11.
>
> The ostensible reason for it was to wait for kswapd to start writing
> pages, but that seems questionable as well, since the congestion wait
> code seems to trigger for kswapd itself as well. Also, the logic behind
> delaying anything when we haven't actually started writeback is not
> clear - it only delays actually starting that writeback.
>
> We'll still trigger the congestion waiting if
>
> (a) the process is kswapd, and we hit pages flagged for immediate
>     reclaim
>
> (b) the process is not kswapd, and the zone backing dev writeback is
>     actually congested.
>
> This probably needs to be revisited, but as it is this fixes a reported
> regression.
>
> [mhocko@xxxxxxx: backport to 3.12 stable tree]
> Fixes: e2be15f6c3ee ('mm: vmscan: stall page reclaim and writeback pages based on dirty/writepage pages encountered')

This seems to be applicable to the 3.11 kernel as well. If there are no
objections, I'll queue it.

Cheers,
--
Luís

> Reported-by: Felipe Contreras <felipe.contreras@xxxxxxxxx>
> Pinpointed-by: Hillf Danton <dhillf@xxxxxxxxx>
> Cc: Michal Hocko <mhocko@xxxxxxx>
> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Cc: Mel Gorman <mgorman@xxxxxxx>
> Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> Signed-off-by: Michal Hocko <mhocko@xxxxxxx>
> ---
>  mm/vmscan.c | 11 +++++------
>  1 file changed, 5 insertions(+), 6 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 1d891f49587b..5ad29b2925a0 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1522,19 +1522,18 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
>  		 * If dirty pages are scanned that are not queued for IO, it
>  		 * implies that flushers are not keeping up. In this case, flag
>  		 * the zone ZONE_TAIL_LRU_DIRTY and kswapd will start writing
> -		 * pages from reclaim context. It will forcibly stall in the
> -		 * next check.
> +		 * pages from reclaim context.
>  		 */
>  		if (nr_unqueued_dirty == nr_taken)
>  			zone_set_flag(zone, ZONE_TAIL_LRU_DIRTY);
>
>  		/*
> -		 * In addition, if kswapd scans pages marked marked for
> -		 * immediate reclaim and under writeback (nr_immediate), it
> -		 * implies that pages are cycling through the LRU faster than
> +		 * If kswapd scans pages marked marked for immediate
> +		 * reclaim and under writeback (nr_immediate), it implies
> +		 * that pages are cycling through the LRU faster than
>  		 * they are written so also forcibly stall.
>  		 */
> -		if (nr_unqueued_dirty == nr_taken || nr_immediate)
> +		if (nr_immediate)
>  			congestion_wait(BLK_RW_ASYNC, HZ/10);
>  	}
>
> --
> 2.0.1
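
[Editor's note: for readers skimming the thread, below is a rough sketch
of the two stall paths that survive this patch, matching conditions (a)
and (b) in the commit message. It is a simplified illustration, not the
literal mm/vmscan.c source: the surrounding bookkeeping in
shrink_inactive_list() is omitted, the current_is_kswapd() guard on the
first check follows the commit message's wording rather than the exact
placement in the diff, and variable names are taken from the quoted
hunk.]

	/*
	 * (a) kswapd found pages flagged for immediate reclaim that
	 *     are still under writeback: the LRU is cycling faster
	 *     than IO completes, so sleep unconditionally for up to
	 *     100ms (HZ/10 jiffies).
	 */
	if (current_is_kswapd() && nr_immediate)
		congestion_wait(BLK_RW_ASYNC, HZ/10);

	/*
	 * (b) direct reclaim: stall only if the zone's backing device
	 *     is actually congested; wait_iff_congested() returns
	 *     immediately when it is not.
	 */
	if (!current_is_kswapd())
		wait_iff_congested(zone, BLK_RW_ASYNC, HZ/10);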