On Thu, Apr 05, 2018 at 03:45:26PM -0400, Johannes Weiner wrote: > On Thu, Apr 05, 2018 at 07:59:20PM +0100, Roman Gushchin wrote: > > If memcg's usage is equal to the memory.low value, avoid reclaiming > > from this cgroup while there is a surplus of reclaimable memory. > > > > This sounds more logical and also matches memory.high and memory.max > > behavior: both are inclusive. > > I was trying to figure out why we did it this way in the first place > and found this patch: > > commit 4e54dede38b45052a941bcf709f7d29f2e18174d > Author: Michal Hocko <mhocko@xxxxxxx> > Date: Fri Feb 27 15:51:46 2015 -0800 > > memcg: fix low limit calculation > > A memcg is considered low limited even when the current usage is equal to > the low limit. This leads to interesting side effects e.g. > groups/hierarchies with no memory accounted are considered protected and > so the reclaim will emit MEMCG_LOW event when encountering them. > > Another and much bigger issue was reported by Joonsoo Kim. He has hit a > NULL ptr dereference with the legacy cgroup API which even doesn't have > low limit exposed. The limit is 0 by default but the initial check fails > for memcg with 0 consumption and parent_mem_cgroup() would return NULL if > use_hierarchy is 0 and so page_counter_read would try to dereference NULL. > > I suppose that the current implementation is just an overlook because the > documentation in Documentation/cgroups/unified-hierarchy.txt says: > > "The memory.low boundary on the other hand is a top-down allocated > reserve. A cgroup enjoys reclaim protection when it and all its > ancestors are below their low boundaries" > > Fix the usage and the low limit comparision in mem_cgroup_low accordingly. > > > @@ -5709,7 +5709,7 @@ bool mem_cgroup_low(struct mem_cgroup *root, struct mem_cgroup *memcg) > > elow = min(elow, parent_elow * low_usage / siblings_low_usage); > > exit: > > memcg->memory.elow = elow; > > - return usage < elow; > > + return usage <= elow; > > So I think this needs to be usage && usage <= elow to not emit > MEMCG_LOW events in case usage == elow == 0. Perfect catch! Thanks, Johannes! Updated version below. -- >From 466c35c36cae392cfee5e54a2884792972e789ee Mon Sep 17 00:00:00 2001 From: Roman Gushchin <guro@xxxxxx> Date: Thu, 5 Apr 2018 19:31:35 +0100 Subject: [PATCH v4 3/4] mm: treat memory.low value inclusive If memcg's usage is equal to the memory.low value, avoid reclaiming from this cgroup while there is a surplus of reclaimable memory. This sounds more logical and also matches memory.high and memory.max behavior: both are inclusive. Empty cgroups are not considered protected, so MEMCG_LOW events are not emitted for empty cgroups, if there is no more reclaimable memory in the system. Signed-off-by: Roman Gushchin <guro@xxxxxx> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> Cc: Johannes Weiner <hannes@xxxxxxxxxxx> Cc: Michal Hocko <mhocko@xxxxxxxxxx> Cc: Vladimir Davydov <vdavydov.dev@xxxxxxxxx> Cc: Tejun Heo <tj@xxxxxxxxxx> Cc: kernel-team@xxxxxx Cc: linux-mm@xxxxxxxxx Cc: cgroups@xxxxxxxxxxxxxxx Cc: linux-kernel@xxxxxxxxxxxxxxx --- mm/memcontrol.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 78cf21f2a943..3d039fa1a8f5 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -5608,14 +5608,14 @@ struct cgroup_subsys memory_cgrp_subsys = { }; /** - * mem_cgroup_low - check if memory consumption is below the normal range + * mem_cgroup_low - check if memory consumption is in the normal range * @root: the top ancestor of the sub-tree being checked * @memcg: the memory cgroup to check * * WARNING: This function is not stateless! It can only be used as part * of a top-down tree iteration, not for isolated queries. * - * Returns %true if memory consumption of @memcg is below the normal range. + * Returns %true if memory consumption of @memcg is in the normal range. * * @root is exclusive; it is never low when looked at directly * @@ -5709,7 +5709,7 @@ bool mem_cgroup_low(struct mem_cgroup *root, struct mem_cgroup *memcg) elow = min(elow, parent_elow * low_usage / siblings_low_usage); exit: memcg->memory.elow = elow; - return usage < elow; + return usage && usage <= elow; } /** -- 2.14.3