On Thu, 3 Jan 2013 19:09:01 +0100 Michal Hocko <mhocko@xxxxxxx> wrote: > Hi, > I have posted this quite some time ago > (https://lkml.org/lkml/2012/12/14/102) but it probably slipped through > --- > >From 28b4e10bc3c18b82bee695b76f4bf25c03baa5f8 Mon Sep 17 00:00:00 2001 > From: Michal Hocko <mhocko@xxxxxxx> > Date: Fri, 14 Dec 2012 11:12:43 +0100 > Subject: [PATCH] memcg,vmscan: do not break out targeted reclaim without > reclaimed pages > > Targeted (hard resp. soft) reclaim has traditionally tried to scan one > group with decreasing priority until nr_to_reclaim (SWAP_CLUSTER_MAX > pages) is reclaimed or all priorities are exhausted. The reclaim is > then retried until the limit is met. > > This approach, however, doesn't work well with deeper hierarchies where > groups higher in the hierarchy do not have any or only very few pages > (this usually happens if those groups do not have any tasks and they > have only re-parented pages after some of their children is removed). > Those groups are reclaimed with decreasing priority pointlessly as there > is nothing to reclaim from them. > > An easiest fix is to break out of the memcg iteration loop in shrink_zone > only if the whole hierarchy has been visited or sufficient pages have > been reclaimed. This is also more natural because the reclaimer expects > that the hierarchy under the given root is reclaimed. As a result we can > simplify the soft limit reclaim which does its own iteration. > > Reported-by: Ying Han <yinghan@xxxxxxxxxx> But what was in that report? My guess would be "excessive CPU consumption", and perhaps "excessive reclaim in the higher-level memcgs". IOW, what are the user-visible effects of this change? (And congrats - you're the first person I've sent that sentence to this year! But not, I fear, the last) I don't really understand what prevents limit reclaim from stealing lots of pages from the top-level groups. How do we ensure balancing/fairness in this case? > --- a/mm/vmscan.c > +++ b/mm/vmscan.c > @@ -1973,18 +1973,17 @@ static void shrink_zone(struct zone *zone, struct scan_control *sc) shrink_zone() might be getting a bit bloaty for CONFIG_MEMCG=n kernels. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>