Re: [PATCH] mm: skip current when memcg reclaim

Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> · Fri, 15 Oct 2021 19:58:37 -0700

On Sat, 16 Oct 2021 10:28:54 +0800 Zhaoyang Huang <huangzhaoyang@xxxxxxxxx> wrote:

> On Sat, Oct 16, 2021 at 4:00 AM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > On Fri, 15 Oct 2021 14:15:29 +0800 Huangzhaoyang <huangzhaoyang@xxxxxxxxx> wrote:
> >
> > > From: Zhaoyang Huang <zhaoyang.huang@xxxxxxxxxx>
> > >
> > > Sibling thread of the same process could refault the reclaimed pages
> > > in the same time, which would be typical in None global reclaim and
> > > introduce thrashing.
> >
> > "None" -> "node", I assume?
> >
> > > --- a/mm/vmscan.c
> > > +++ b/mm/vmscan.c
> > > @@ -2841,6 +2841,11 @@ static void shrink_node_memcgs(pg_data_t *pgdat, struct scan_control *sc)
> > >                               sc->memcg_low_skipped = 1;
> > >                               continue;
> > >                       }
> > > +                     /*
> > > +                      * Don't bother current when its memcg is below low
> > > +                      */
> >
> > The comment explains what the code is doing, but the code itself
> > already does this.  Please can we have a comment that explains *why*
> > the code is doing this?
> We find that the patch help direct reclaiming bail out early and
> eliminate page thrashing for some scenarios(etc APP start on android).
> The case could be worse if each APP possess a unique memcg(pages on
> current's lru are reclaimed more than global reclaim)

What I meant was: please redo the patch with a comment which explains
"why the code does this", rather than "what the code does".

Also, as this is a performance enhancement, it is preferred to have
some performance testing results in the changelog.