Re: [PATCH] mm, memcg: fix wrong mem cgroup protection

On Fri, Apr 24, 2020 at 06:21:03PM +0200, Michal Hocko wrote:
> On Fri 24-04-20 11:10:13, Johannes Weiner wrote:
> > On Fri, Apr 24, 2020 at 04:29:58PM +0200, Michal Hocko wrote:
> > > On Fri 24-04-20 09:14:50, Johannes Weiner wrote:
> > > > On Thu, Apr 23, 2020 at 02:16:29AM -0400, Yafang Shao wrote:
> > > > > This patch is an improvement of a previous version[1], as the previous
> > > > > version is not easy to understand.
> > > > > This issue persists in the newest kernel, I have to resend the fix. As
> > > > > the implementation is changed, I drop Roman's ack from the previous
> > > > > version.
> > > > 
> > > > Now that I understand the problem, I much prefer the previous version.
> > > > 
> > > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > > > index 745697906ce3..2bf91ae1e640 100644
> > > > --- a/mm/memcontrol.c
> > > > +++ b/mm/memcontrol.c
> > > > @@ -6332,8 +6332,19 @@ enum mem_cgroup_protection mem_cgroup_protected(struct mem_cgroup *root,
> > > >  
> > > >  	if (!root)
> > > >  		root = root_mem_cgroup;
> > > > -	if (memcg == root)
> > > > +	if (memcg == root) {
> > > > +		/*
> > > > +		 * The cgroup is the reclaim root in this reclaim
> > > > +		 * cycle, and therefore not protected. But it may have
> > > > +		 * stale effective protection values from previous
> > > > +		 * cycles in which it was not the reclaim root - for
> > > > +		 * example, global reclaim followed by limit reclaim.
> > > > +		 * Reset these values for mem_cgroup_protection().
> > > > +		 */
> > > > +		memcg->memory.emin = 0;
> > > > +		memcg->memory.elow = 0;
> > > >  		return MEMCG_PROT_NONE;
> > > > +	}
> > > 
> > > Could you be more specific why you prefer this over the
> > > mem_cgroup_protection which doesn't change the effective value?
> > > Isn't it easier to simply ignore effective value for the reclaim roots?
> > 
> > Because now both mem_cgroup_protection() and mem_cgroup_protected()
> > have to know about the reclaim root semantics, instead of just the one
> > central place.
> 
> Yes this is true but it is also potentially overwriting the state with
> a parallel reclaim which can lead to surprising results

Checking in mem_cgroup_protection() doesn't avoid the fundamental race:

  root
     `- A (low=2G, elow=2G, max=3G)
        `- A1 (low=2G, elow=2G)

If A does limit reclaim while global reclaim races, the memcg == root
check in mem_cgroup_protection() will reliably calculate the "right"
scan value for A, which has no pages, and the wrong scan value for A1
where the memory actually is.

I'm okay with fixing the case where a really old left-over value is
used by target reclaim.

I don't see a point in special-casing this one instance of a
fundamental race condition at the expense of less robust code.
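
For illustration, the stale-value case (global reclaim followed by limit
reclaim at A) can be sketched in user space. This is a rough model under
stated assumptions, not the kernel implementation: struct memcg_model,
protected_model() and the reset_on_root switch are hypothetical names made
up for this sketch, and the usage-proportional distribution of protection
among siblings is deliberately left out. It only shows that A keeps its
old elow from the global walk unless the reclaim-root branch resets it,
as the quoted diff does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GiB (1ULL << 30)

/* Hypothetical user-space stand-in for struct mem_cgroup; not kernel code. */
struct memcg_model {
	struct memcg_model *parent;
	uint64_t usage;		/* hierarchical usage */
	uint64_t min, low;	/* configured protection */
	uint64_t emin, elow;	/* effective protection cached by the last walk */
};

enum prot_model { PROT_NONE, PROT_LOW, PROT_MIN };

static uint64_t min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/*
 * Simplified model of the check: the reclaim root is never protected.
 * With reset_on_root set, its cached effective values are also cleared,
 * mirroring the two assignments added by the quoted diff.
 */
static enum prot_model protected_model(struct memcg_model *root,
				       struct memcg_model *memcg,
				       int reset_on_root)
{
	if (memcg == root) {
		if (reset_on_root) {
			memcg->emin = 0;
			memcg->elow = 0;
		}
		return PROT_NONE;
	}

	if (memcg->parent == root) {
		/* Direct children of the reclaim root use their own settings. */
		memcg->emin = memcg->min;
		memcg->elow = memcg->low;
	} else {
		/* Assumption: cap by the parent only, no sibling scaling. */
		memcg->emin = min_u64(memcg->min, memcg->parent->emin);
		memcg->elow = min_u64(memcg->low, memcg->parent->elow);
	}

	if (memcg->usage && memcg->usage <= memcg->emin)
		return PROT_MIN;
	if (memcg->usage && memcg->usage <= memcg->elow)
		return PROT_LOW;
	return PROT_NONE;
}

/* Returns A's cached elow after global reclaim followed by limit reclaim at A. */
uint64_t stale_elow_seen_by_limit_reclaim(int reset_on_root)
{
	struct memcg_model root = { .parent = NULL };
	struct memcg_model a = { .parent = &root, .usage = 3 * GiB, .low = 2 * GiB };
	struct memcg_model a1 = { .parent = &a, .usage = 3 * GiB, .low = 2 * GiB };

	/* Global reclaim: the top-level root is the reclaim root. */
	protected_model(&root, &root, reset_on_root);
	protected_model(&root, &a, reset_on_root);
	protected_model(&root, &a1, reset_on_root);

	/* Limit reclaim: A hit its max, so A is now the reclaim root. */
	protected_model(&a, &a, reset_on_root);
	protected_model(&a, &a1, reset_on_root);

	/* A1 is recomputed on every walk, so its value is never left over. */
	assert(a1.elow == 2 * GiB);

	return a.elow;
}
```

In this model, stale_elow_seen_by_limit_reclaim(0) leaves A with the 2G
elow from the global walk, while stale_elow_seen_by_limit_reclaim(1)
returns 0. The sketch is strictly sequential, so it says nothing about
the parallel-reclaim race discussed above.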