On Sat, Jan 4, 2020 at 11:36 AM Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>
> On Tue, Dec 24, 2019 at 02:53:25AM -0500, Yafang Shao wrote:
> > The lru walker isolation function may use this memcg to do something, e.g.
> > the inode isolation function will use the memcg to do inode protection in
> > a followup patch. So make memcg visible to the lru walker isolation function.
> >
> > Something that should be emphasized in this patch is that it replaces
> > for_each_memcg_cache_index() with for_each_mem_cgroup() in
> > list_lru_walk_node(). There is a gap between these two macros:
> > for_each_mem_cgroup() depends on CONFIG_MEMCG while the other one depends
> > on CONFIG_MEMCG_KMEM. But as list_lru_memcg_aware() returns false if
> > CONFIG_MEMCG_KMEM is not configured, this replacement is safe.
> >
> > Cc: Dave Chinner <dchinner@xxxxxxxxxx>
> > Signed-off-by: Yafang Shao <laoar.shao@xxxxxxxxx>
>
> ....
>
> > @@ -299,17 +299,15 @@ unsigned long list_lru_walk_node(struct list_lru *lru, int nid,
> >  				 list_lru_walk_cb isolate, void *cb_arg,
> >  				 unsigned long *nr_to_walk)
> >  {
> > +	struct mem_cgroup *memcg;
> >  	long isolated = 0;
> > -	int memcg_idx;
> >
> > -	isolated += list_lru_walk_one(lru, nid, NULL, isolate, cb_arg,
> > -				      nr_to_walk);
> > -	if (*nr_to_walk > 0 && list_lru_memcg_aware(lru)) {
> > -		for_each_memcg_cache_index(memcg_idx) {
> > +	if (list_lru_memcg_aware(lru)) {
> > +		for_each_mem_cgroup(memcg) {
> >  			struct list_lru_node *nlru = &lru->node[nid];
> >
> >  			spin_lock(&nlru->lock);
> > -			isolated += __list_lru_walk_one(nlru, memcg_idx,
> > +			isolated += __list_lru_walk_one(nlru, memcg,
> >  							isolate, cb_arg,
> >  							nr_to_walk);
> >  			spin_unlock(&nlru->lock);
> > @@ -317,7 +315,11 @@ unsigned long list_lru_walk_node(struct list_lru *lru, int nid,
> >  			if (*nr_to_walk <= 0)
> >  				break;
> >  		}
> > +	} else {
> > +		isolated += list_lru_walk_one(lru, nid, NULL, isolate, cb_arg,
> > +					      nr_to_walk);
> >  	}
> > +
>
> That's a change of behaviour. The old code always runs per-node
> reclaim, then if the LRU is memcg aware it also runs the memcg
> aware reclaim. The new code never runs global per-node reclaim
> if the list is memcg aware, so shrinkers that are initialised
> with the flags SHRINKER_NUMA_AWARE | SHRINKER_MEMCG_AWARE seem
> likely to have reclaim problems with mixed memcg/global memory
> pressure scenarios.
>
> e.g. if all the memory is in the per-node lists, and the memcg needs
> to reclaim memory because of a global shortage, it is now unable to
> reclaim global memory.....
>

Hi Dave,

Thanks for your detailed explanation, but I have a different understanding.

The difference between for_each_mem_cgroup(memcg) and
for_each_memcg_cache_index(memcg_idx) is that for_each_mem_cgroup()
includes the root_mem_cgroup, while for_each_memcg_cache_index()
excludes the root_mem_cgroup because its memcg_idx is -1. (I have
appended rough definitions of both iterators below for reference.)

So the new code can still reclaim global memory even if the list is
memcg aware. Is that right?

Thanks
Yafang
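
For reference, below is roughly how the two iterators and the list_lru
lookup helper read in this kernel, paraphrased from
include/linux/memcontrol.h, mm/memcontrol.c and mm/list_lru.c (the exact
text may differ slightly). mem_cgroup_iter(NULL, NULL, NULL) starts the
hierarchy walk at root_mem_cgroup, whereas the index-based walk only
covers kmemcg ids 0..memcg_nr_cache_ids-1, and the root's id is -1, so
the root is never visited by it:

/* include/linux/memcontrol.h: walks kmemcg ids only, root (id == -1) is skipped */
#define for_each_memcg_cache_index(_idx)	\
	for ((_idx) = 0; (_idx) < memcg_nr_cache_ids; (_idx)++)

/* mm/memcontrol.c: hierarchy walk that starts from (and includes) root_mem_cgroup */
#define for_each_mem_cgroup(iter)			\
	for (iter = mem_cgroup_iter(NULL, NULL, NULL);	\
	     iter != NULL;				\
	     iter = mem_cgroup_iter(NULL, iter, NULL))

/* mm/list_lru.c: a negative memcg index selects the per-node (global) list */
static inline struct list_lru_one *
list_lru_from_memcg_idx(struct list_lru_node *nlru, int idx)
{
	struct list_lru_memcg *memcg_lrus;

	memcg_lrus = rcu_dereference_check(nlru->memcg_lrus,
					   lockdep_is_held(&nlru->lock));
	if (memcg_lrus && idx >= 0)
		return memcg_lrus->lru[idx];
	return &nlru->lru;
}

So when the walk reaches the root memcg, whose kmemcg id is -1, the
lookup falls back to &nlru->lru, i.e. the global per-node list.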