On Tue, Dec 24, 2019 at 02:53:25AM -0500, Yafang Shao wrote:
> The lru walker isolation function may use this memcg to do something, e.g.
> the inode isolation function will use the memcg to do inode protection in
> a followup patch. So make memcg visible to the lru walker isolation
> function.
>
> Something that should be emphasized in this patch is that it replaces
> for_each_memcg_cache_index() with for_each_mem_cgroup() in
> list_lru_walk_node(). There is a gap between these two macros:
> for_each_mem_cgroup() depends on CONFIG_MEMCG while the other one depends
> on CONFIG_MEMCG_KMEM. But as list_lru_memcg_aware() returns false if
> CONFIG_MEMCG_KMEM is not configured, it is safe to make this replacement.
>
> Cc: Dave Chinner <dchinner@xxxxxxxxxx>
> Signed-off-by: Yafang Shao <laoar.shao@xxxxxxxxx>

....

> @@ -299,17 +299,15 @@ unsigned long list_lru_walk_node(struct list_lru *lru, int nid,
>  				 list_lru_walk_cb isolate, void *cb_arg,
>  				 unsigned long *nr_to_walk)
>  {
> +	struct mem_cgroup *memcg;
>  	long isolated = 0;
> -	int memcg_idx;
>
> -	isolated += list_lru_walk_one(lru, nid, NULL, isolate, cb_arg,
> -				      nr_to_walk);
> -	if (*nr_to_walk > 0 && list_lru_memcg_aware(lru)) {
> -		for_each_memcg_cache_index(memcg_idx) {
> +	if (list_lru_memcg_aware(lru)) {
> +		for_each_mem_cgroup(memcg) {
>  			struct list_lru_node *nlru = &lru->node[nid];
>
>  			spin_lock(&nlru->lock);
> -			isolated += __list_lru_walk_one(nlru, memcg_idx,
> +			isolated += __list_lru_walk_one(nlru, memcg,
>  							isolate, cb_arg,
>  							nr_to_walk);
>  			spin_unlock(&nlru->lock);
> @@ -317,7 +315,11 @@ unsigned long list_lru_walk_node(struct list_lru *lru, int nid,
>  			if (*nr_to_walk <= 0)
>  				break;
>  		}
> +	} else {
> +		isolated += list_lru_walk_one(lru, nid, NULL, isolate, cb_arg,
> +					      nr_to_walk);
>  	}
> +

That's a change of behaviour. The old code always runs per-node reclaim,
then if the LRU is memcg aware it also runs the memcg aware reclaim. The
new code never runs global per-node reclaim if the list is memcg aware,
so shrinkers that are initialised with the flags SHRINKER_NUMA_AWARE |
SHRINKER_MEMCG_AWARE seem likely to have reclaim problems with mixed
memcg/global memory pressure scenarios.

e.g. if all the memory is in the per-node lists, and the memcg needs to
reclaim memory because of a global shortage, it is now unable to reclaim
global memory.....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
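
For illustration, a minimal, untested sketch of a list_lru_walk_node()
that keeps the old semantics Dave describes: the global (non-memcg)
per-node walk always runs, and the memcg-aware walk only runs on top of
it. It is reconstructed purely from the hunks quoted above; all function
and macro names come from that diff, and the locking/iterator details
are assumptions rather than a verified fix.

/*
 * Sketch only -- combines the unconditional global walk from the old
 * code with the for_each_mem_cgroup() loop from the proposed patch,
 * so memcg-aware lists still see global per-node pressure.
 */
unsigned long list_lru_walk_node(struct list_lru *lru, int nid,
                                 list_lru_walk_cb isolate, void *cb_arg,
                                 unsigned long *nr_to_walk)
{
        struct mem_cgroup *memcg;
        long isolated = 0;

        /* Global (non-memcg) per-node reclaim always runs first. */
        isolated += list_lru_walk_one(lru, nid, NULL, isolate, cb_arg,
                                      nr_to_walk);

        if (*nr_to_walk > 0 && list_lru_memcg_aware(lru)) {
                for_each_mem_cgroup(memcg) {
                        struct list_lru_node *nlru = &lru->node[nid];

                        spin_lock(&nlru->lock);
                        isolated += __list_lru_walk_one(nlru, memcg,
                                                        isolate, cb_arg,
                                                        nr_to_walk);
                        spin_unlock(&nlru->lock);

                        /*
                         * Breaking out of for_each_mem_cgroup() early
                         * likely needs mem_cgroup_iter_break() to drop
                         * the iterator reference; the quoted patch does
                         * not show that detail.
                         */
                        if (*nr_to_walk <= 0)
                                break;
                }
        }
        return isolated;
}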