Re: [PATCH 2/2] mm/vmscan: calculate reclaimed slab caches in all reclaim paths

Yafang Shao <laoar.shao@xxxxxxxxx> · Mon, 24 Jun 2019 20:30:12 +0800

On Mon, Jun 24, 2019 at 4:53 PM Kirill Tkhai <ktkhai@xxxxxxxxxxxxx> wrote:
>
> On 21.06.2019 13:14, Yafang Shao wrote:
> > There are currently six different reclaim paths:
> > - kswapd reclaim path
> > - node reclaim path
> > - hibernate preallocate memory reclaim path
> > - direct reclaim path
> > - memcg reclaim path
> > - memcg softlimit reclaim path
> >
> > The slab caches reclaimed in these paths are only accounted for in the
> > first three paths listed above.
> >
> > There are some drawbacks if we don't account for the reclaimed slab caches:
> > - sc->nr_reclaimed isn't correct if slab caches are reclaimed in this
> >   path.
> > - The slab caches may be reclaimed almost completely if there are lots of
> >   reclaimable slab caches and few page caches.
> >   Here is a simple example of this case.
> >   Suppose one memcg is full of slab caches and its limit is 512M, in
> >   other words there are approximately 512M of slab caches in this memcg.
> >   When the limit of the memcg is reached, memcg reclaim begins, and
> >   this reclaim path continuously reclaims the slab caches until
> >   sc->priority drops to 0.
> >   After this reclaim stops, there are few slab caches left: less than
> >   20M in my test case.
> >   With this patch applied, more than 300M is left and sc->priority
> >   only drops to 3.
> >
> > Signed-off-by: Yafang Shao <laoar.shao@xxxxxxxxx>
> > ---
> >  mm/vmscan.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> >
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 18a66e5..d6c3fc8 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -3164,11 +3164,13 @@ unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
> >       if (throttle_direct_reclaim(sc.gfp_mask, zonelist, nodemask))
> >               return 1;
> >
> > +     current->reclaim_state = &sc.reclaim_state;
> >       trace_mm_vmscan_direct_reclaim_begin(order, sc.gfp_mask);
> >
> >       nr_reclaimed = do_try_to_free_pages(zonelist, &sc);
> >
> >       trace_mm_vmscan_direct_reclaim_end(nr_reclaimed);
> > +     current->reclaim_state = NULL;
>
> Shouldn't we remove reclaim_state assignment from __perform_reclaim() after this?
>

Oh yes, we should remove it. Thanks for pointing that out.
I will post a fix soon.

Thanks
Yafang

> >       return nr_reclaimed;
> >  }
> > @@ -3191,6 +3193,7 @@ unsigned long mem_cgroup_shrink_node(struct mem_cgroup *memcg,
> >       };
> >       unsigned long lru_pages;
> >
> > +     current->reclaim_state = &sc.reclaim_state;
> >       sc.gfp_mask = (gfp_mask & GFP_RECLAIM_MASK) |
> >                       (GFP_HIGHUSER_MOVABLE & ~GFP_RECLAIM_MASK);
> >
> > @@ -3212,7 +3215,9 @@ unsigned long mem_cgroup_shrink_node(struct mem_cgroup *memcg,
> >                                       cgroup_ino(memcg->css.cgroup),
> >                                       sc.nr_reclaimed);
> >
> > +     current->reclaim_state = NULL;
> >       *nr_scanned = sc.nr_scanned;
> > +
> >       return sc.nr_reclaimed;
> >  }
> >
> > @@ -3239,6 +3244,7 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *memcg,
> >               .may_shrinkslab = 1,
> >       };
> >
> > +     current->reclaim_state = &sc.reclaim_state;
> >       /*
> >        * Unlike direct reclaim via alloc_pages(), memcg's reclaim doesn't
> >        * take care of from where we get pages. So the node where we start the
> > @@ -3263,6 +3269,7 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *memcg,
> >       trace_mm_vmscan_memcg_reclaim_end(
> >                               cgroup_ino(memcg->css.cgroup),
> >                               nr_reclaimed);
> > +     current->reclaim_state = NULL;
> >
> >       return nr_reclaimed;
> >  }
> >
>