On Sat, Dec 28, 2019 at 11:00 AM Roman Gushchin <guro@xxxxxx> wrote:
>
> On Sat, Dec 28, 2019 at 09:45:11AM +0800, Yafang Shao wrote:
> > On Sat, Dec 28, 2019 at 7:49 AM Roman Gushchin <guro@xxxxxx> wrote:
> > >
> > > On Fri, Dec 27, 2019 at 07:43:53AM -0500, Yafang Shao wrote:
> > > > memory.{emin, elow} are set in mem_cgroup_protected(), and the values of
> > > > them won't be changed until next recalculation in this function. After
> > > > either or both of them are set, the next reclaimer to reclaim this memcg
> > > > may be a different reclaimer, e.g. this memcg is also the root memcg of
> > > > the new reclaimer, and then in mem_cgroup_protection() in get_scan_count()
> > > > the old values of them will be used to calculate scan count, that is not
> > > > proper. We should reset them to zero in this case.
> > > >
> > > > Here's an example of this issue.
> > > >
> > > >     root_mem_cgroup
> > > >          /
> > > >         A   memory.max=1024M memory.min=512M memory.current=800M
> > > >
> > > > Once kswapd is woken up, it will try to scan all MEMCGs, including
> > > > this A, and it will assign memory.emin of A with 512M.
> > > > After that, A may reach its hard limit (memory.max), and then it will
> > > > do memcg reclaim. Because A is the root of this reclaimer, so it will
> > > > not calculate its memory.emin. So the memory.emin is the old value
> > > > 512M, and then this old value will be used in
> > > > mem_cgroup_protection() in get_scan_count() to get the scan count.
> > > > That is not proper.
> > > >
> > > > Fixes: 9783aa9917f8 ("mm, memcg: proportional memory.{low,min} reclaim")
> > > > Signed-off-by: Yafang Shao <laoar.shao@xxxxxxxxx>
> > > > Cc: Chris Down <chris@xxxxxxxxxxxxxx>
> > > > Cc: Roman Gushchin <guro@xxxxxx>
> > > > Cc: stable@xxxxxxxxxxxxxxx
> > > > ---
> > > >  mm/memcontrol.c | 11 ++++++++++-
> > > >  1 file changed, 10 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > > > index 601405b..bb3925d 100644
> > > > --- a/mm/memcontrol.c
> > > > +++ b/mm/memcontrol.c
> > > > @@ -6287,8 +6287,17 @@ enum mem_cgroup_protection mem_cgroup_protected(struct mem_cgroup *root,
> > > >
> > > >       if (!root)
> > > >               root = root_mem_cgroup;
> > > > -     if (memcg == root)
> > > > +     if (memcg == root) {
> > > > +             /*
> > > > +              * Reset memory.(emin, elow) for reclaiming the memcg
> > > > +              * itself.
> > > > +              */
> > > > +             if (memcg != root_mem_cgroup) {
> > > > +                     memcg->memory.emin = 0;
> > > > +                     memcg->memory.elow = 0;
> > > > +             }
> > >
> > > I'm sorry, that didn't bring it from scratch, but I doubt that zeroing effective
> > > protection is correct. Imagine a simple config: a large cgroup subtree with memory.max
> > > set on the top level. Reaching this limit doesn't mean that all protection
> > > configuration inside the tree can be ignored.
> > >
> >
> > No, they won't be ignored.
> > Pls. see the logic in mem_cgroup_protected(), it will re-calculate all
> > its children's effective min and low.
>
> Ah, you're right. I forgot about this
>         if (parent == root)
>                 goto exit;
>
> which saves elow/emin from being truncated to 0. Sorry.
>
> Please, feel free to add
> Acked-by: Roman Gushchin <guro@xxxxxx>
>

Thanks for your review.

Thanks
Yafang
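
For readers following the thread, the stale-emin scenario from the commit message can be reproduced with a small userspace model. This is only a sketch under stated assumptions, not kernel code: struct toy_memcg and the toy_* functions are hypothetical stand-ins for struct mem_cgroup, mem_cgroup_protected() and mem_cgroup_protection(), sizes are in MiB, and the effective-min calculation is reduced to min(memory.min, usage) for a single level.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mem_cgroup; all sizes are in MiB. */
struct toy_memcg {
	struct toy_memcg *parent;	/* NULL for the global root */
	unsigned long min;		/* configured memory.min */
	unsigned long usage;		/* memory.current */
	unsigned long emin;		/* cached effective min */
};

/*
 * Simplified model of mem_cgroup_protected(). When memcg is the reclaim
 * root, emin is not recalculated. With reset_on_root set, the cached
 * value is cleared instead (unless memcg is the global root), which is
 * the behaviour the patch in this thread proposes.
 */
static void toy_protected(struct toy_memcg *root, struct toy_memcg *memcg,
			  bool reset_on_root)
{
	assert(root != NULL && memcg != NULL);

	if (memcg == root) {
		if (reset_on_root && memcg->parent != NULL)
			memcg->emin = 0;
		return;
	}

	/* Assumption: single level, so emin = min(memory.min, usage). */
	memcg->emin = memcg->min < memcg->usage ? memcg->min : memcg->usage;
}

/* Simplified model of the lookup done from get_scan_count(). */
static unsigned long toy_protection(const struct toy_memcg *memcg)
{
	return memcg->emin;
}

/*
 * Replays the example: global reclaim (kswapd) visits A first, then A
 * hits memory.max and is reclaimed with itself as the reclaim root.
 * Returns the protection value that the second reclaimer would see.
 */
static unsigned long toy_scenario(bool reset_on_root)
{
	struct toy_memcg root = { NULL, 0, 0, 0 };
	struct toy_memcg a = { &root, 512, 800, 0 };

	toy_protected(&root, &a, reset_on_root);	/* kswapd pass */
	a.usage = 1024;					/* A reaches memory.max */
	toy_protected(&a, &a, reset_on_root);		/* memcg reclaim of A */

	return toy_protection(&a);
}
```

In this model, toy_scenario(false) leaves the value cached by the kswapd pass in place, while toy_scenario(true) clears it before the scan count would be computed. Children of A are not modelled here; as noted above, their effective values are recalculated by mem_cgroup_protected() on each pass.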