On Sat, Dec 28, 2019 at 7:49 AM Roman Gushchin <guro@xxxxxx> wrote:
>
> On Fri, Dec 27, 2019 at 07:43:53AM -0500, Yafang Shao wrote:
> > memory.{emin, elow} are set in mem_cgroup_protected(), and their values
> > won't change until the next recalculation in this function. After
> > either or both of them are set, the next reclaimer to reclaim this memcg
> > may be a different reclaimer, e.g. this memcg is also the root memcg of
> > the new reclaimer, and then in mem_cgroup_protection() in get_scan_count()
> > the old values will be used to calculate the scan count, which is not
> > proper. We should reset them to zero in this case.
> >
> > Here's an example of this issue.
> >
> >     root_mem_cgroup
> >          /
> >         A   memory.max=1024M memory.min=512M memory.current=800M
> >
> > Once kswapd is woken up, it will try to scan all MEMCGs, including
> > this A, and it will set memory.emin of A to 512M.
> > After that, A may reach its hard limit (memory.max), and then it will
> > do memcg reclaim. Because A is the root of this reclaimer, it will
> > not recalculate its memory.emin. So memory.emin keeps the old value
> > 512M, and this old value will be used in
> > mem_cgroup_protection() in get_scan_count() to get the scan count.
> > That is not proper.
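
To make the scenario above concrete, here is a minimal user-space sketch
of the two reclaim passes. It is only a toy model, not kernel code:
toy_memcg, toy_root, toy_protected(), toy_emin_after_two_passes() and the
reset_on_root flag are all hypothetical names made up for illustration.
It ignores memory.elow, the usage checks, and the proportional
distribution among siblings that the real mem_cgroup_protected() does.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for struct mem_cgroup. */
struct toy_memcg {
	struct toy_memcg *parent;	/* NULL for the global root */
	unsigned long min_mb;		/* memory.min, in MiB */
	unsigned long emin_mb;		/* cached effective min, in MiB */
};

/* Stands in for root_mem_cgroup; its effective min stays 0. */
static struct toy_memcg toy_root;

/*
 * Toy model of the part of mem_cgroup_protected() discussed here.
 * memcg must be root itself or a descendant of root.
 * reset_on_root == 0 models the old early return,
 * reset_on_root != 0 models the reset proposed in the patch.
 */
static void toy_protected(struct toy_memcg *root, struct toy_memcg *memcg,
			  int reset_on_root)
{
	if (!root)
		root = &toy_root;

	if (memcg == root) {
		/* The reclaim root itself: nothing is recalculated. */
		if (reset_on_root && memcg != &toy_root)
			memcg->emin_mb = 0;
		return;
	}

	if (memcg->parent == root) {
		/* Direct child of the reclaim root: use memory.min as is. */
		memcg->emin_mb = memcg->min_mb;
	} else {
		/* Simplification: only clamp by the parent's effective min. */
		unsigned long parent_emin = memcg->parent->emin_mb;

		memcg->emin_mb = memcg->min_mb < parent_emin ?
				 memcg->min_mb : parent_emin;
	}
}

/* Replays the example: global reclaim first, then memcg reclaim on A. */
unsigned long toy_emin_after_two_passes(int reset_on_root)
{
	struct toy_memcg a = { .parent = &toy_root, .min_mb = 512, .emin_mb = 0 };

	toy_protected(NULL, &a, reset_on_root);	/* kswapd, root = toy_root */
	toy_protected(&a, &a, reset_on_root);	/* A hit memory.max, root = A */

	return a.emin_mb;
}
```

In this model, toy_emin_after_two_passes(0) leaves A with the stale 512
from the kswapd pass, while toy_emin_after_two_passes(1) leaves it at 0,
which is the behaviour the patch is after.
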
> >
> > Fixes: 9783aa9917f8 ("mm, memcg: proportional memory.{low,min} reclaim")
> > Signed-off-by: Yafang Shao <laoar.shao@xxxxxxxxx>
> > Cc: Chris Down <chris@xxxxxxxxxxxxxx>
> > Cc: Roman Gushchin <guro@xxxxxx>
> > Cc: stable@xxxxxxxxxxxxxxx
> > ---
> >  mm/memcontrol.c | 11 ++++++++++-
> >  1 file changed, 10 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 601405b..bb3925d 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -6287,8 +6287,17 @@ enum mem_cgroup_protection mem_cgroup_protected(struct mem_cgroup *root,
> >
> >  	if (!root)
> >  		root = root_mem_cgroup;
> > -	if (memcg == root)
> > +	if (memcg == root) {
> > +		/*
> > +		 * Reset memory.(emin, elow) for reclaiming the memcg
> > +		 * itself.
> > +		 */
> > +		if (memcg != root_mem_cgroup) {
> > +			memcg->memory.emin = 0;
> > +			memcg->memory.elow = 0;
> > +		}
>
> I'm sorry I didn't bring this up from the start, but I doubt that zeroing
> effective protection is correct. Imagine a simple config: a large cgroup
> subtree with memory.max set on the top level. Reaching this limit doesn't
> mean that all protection configuration inside the tree can be ignored.
>

No, they won't be ignored. Please see the logic in mem_cgroup_protected():
it recalculates the effective min and low of all of its children.

> Instead we should respect memory.low/max set by a user on this level
> (look at the parent == root case), maybe clamped by memory.high/max.
>

Let's look at the parent == root case. What if the parent is the
root_mem_cgroup? The memory.{emin, elow} of root_mem_cgroup is always 0,
right? So what is the problem?

Thanks
Yafang