Re: [PATCH v2] mm/vmscan: check references from all memcgs for swapbacked memory

Yosry Ahmed <yosryahmed@xxxxxxxxxx> · Wed, 5 Oct 2022 14:01:55 -0700

On Wed, Oct 5, 2022 at 1:48 PM Yu Zhao <yuzhao@xxxxxxxxxx> wrote:
>
> On Wed, Oct 5, 2022 at 11:37 AM Yosry Ahmed <yosryahmed@xxxxxxxxxx> wrote:
> >
> > During page/folio reclaim, we check if a folio is referenced using
> > folio_referenced() to avoid reclaiming folios that have been recently
> > accessed (hot memory). The rationale is that this memory is likely to be
> > accessed soon, and hence reclaiming it will cause a refault.
> >
> > For memcg reclaim, we currently only check accesses to the folio from
> > processes in the subtree of the target memcg. This behavior was
> > originally introduced by commit bed7161a519a ("Memory controller: make
> > page_referenced() cgroup aware") a long time ago. Back then, refaulted
> > pages would get charged to the memcg of the process that was faulting them
> > in. It made sense to only consider accesses coming from processes in the
> > subtree of target_mem_cgroup. If a page was charged to memcg A but only
> > being accessed by a sibling memcg B, we would reclaim it if memcg A is
> > the reclaim target. memcg B can then fault it back in and get charged
> > for it appropriately.
> >
> > Today, this behavior still makes sense for file pages. However, unlike
> > file pages, when swapbacked pages are refaulted they are charged to the
> > memcg that was originally charged for them when they were swapped out.
> > This means that if a swapbacked page is charged to memcg A but only used by
> > memcg B, and we reclaim it from memcg A, it would simply be faulted back
> > in and charged again to memcg A once memcg B accesses it. In that sense,
> > accesses from all memcgs matter equally when considering if a swapbacked
> > page/folio is a viable reclaim target.
> >
> > Modify folio_referenced() to always consider accesses from all memcgs if
> > the folio is swapbacked.
>
> It seems to me this change can potentially increase the number of
> zombie memcgs. Any risk assessment done on this?

Do you mind elaborating on the case(s) where this could happen? Is this
the cgroup v1 case in mem_cgroup_swapout() where we are reclaiming
from a zombie memcg and swapping out would let us move the charge to
the parent?
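
For anyone following along, here is a rough userspace sketch of the rule
the changelog describes: filter accesses by the target memcg's subtree
for file folios, but count accesses from every memcg for swapbacked
folios. It is only a toy model, not the kernel code. The toy_* types and
helpers are made up for illustration and do not correspond to struct
mem_cgroup, struct folio, or the actual rmap walk in folio_referenced().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, NOT the kernel's structures. */
struct toy_memcg {
	const struct toy_memcg *parent;	/* NULL for the root memcg */
};

struct toy_folio {
	bool swapbacked;		/* anon/shmem-like if true, file if false */
};

struct toy_access {
	const struct toy_memcg *memcg;	/* memcg of the accessing process */
};

/* True if @memcg is @root or one of its descendants. */
static bool toy_in_subtree(const struct toy_memcg *memcg,
			   const struct toy_memcg *root)
{
	for (; memcg; memcg = memcg->parent)
		if (memcg == root)
			return true;
	return false;
}

/*
 * Count the accesses that should keep @folio from being reclaimed.
 * @target is the memcg being reclaimed; NULL models global reclaim.
 */
static int toy_count_references(const struct toy_folio *folio,
				const struct toy_memcg *target,
				const struct toy_access *accesses, size_t n)
{
	/* Only file folios under memcg reclaim ignore outside accesses. */
	bool filter = target && !folio->swapbacked;
	int refs = 0;

	for (size_t i = 0; i < n; i++) {
		if (filter && !toy_in_subtree(accesses[i].memcg, target))
			continue;
		refs++;
	}
	return refs;
}
```

With sibling memcgs A and B under one root and a single access coming
from B, reclaiming from A in this model counts no references for a file
folio and one reference for a swapbacked folio.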