When a memcg is destroyed, it is not released immediately; it lingers until
all of its objects are gone. This means that if a memcg is restarted with the
very same workload - a very common case - the objects already cached won't be
billed to the new memcg. This is mostly undesirable, since a container can
exploit this by restarting itself every time it reaches its limit, and then
coming up again with a fresh new limit.

Now that we have targeted reclaim, I maintain that a memcg that is destroyed
should be flushed away. This makes perfect sense if we assume that a memcg
that goes away most likely indicates an isolated workload that has terminated.

Signed-off-by: Glauber Costa <glommer@xxxxxxxxxx>
Cc: Mel Gorman <mgorman@xxxxxxx>
Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxx>
Cc: Hugh Dickins <hughd@xxxxxxxxxx>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>
---
 mm/memcontrol.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index e2dc89c..90173bc 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6310,10 +6310,27 @@ static int memcg_init_kmem(struct mem_cgroup *memcg, struct cgroup_subsys *ss)
 
 static void kmem_cgroup_css_offline(struct mem_cgroup *memcg)
 {
+	int ret;
 	if (!memcg_kmem_is_active(memcg))
 		return;
 
 	/*
+	 * When a memcg is destroyed, it won't be immediately released until all
+	 * objects are gone. This means that if a memcg is restarted with the
+	 * very same workload - a very common case, the objects already cached
+	 * won't be billed to the new memcg. This is mostly undesirable since a
+	 * container can exploit this by restarting itself every time it
+	 * reaches its limit, and then coming up again with a fresh new limit.
+	 *
+	 * Therefore a memcg that is destroyed should be flushed away. It makes
+	 * perfect sense if we assume that a memcg that goes away indicates an
+	 * isolated workload that is terminated.
+	 */
+	do {
+		ret = try_to_free_mem_cgroup_kmem(memcg, GFP_KERNEL);
+	} while (ret);
+
+	/*
 	 * kmem charges can outlive the cgroup. In the case of slab
 	 * pages, for instance, a page contain objects from various
 	 * processes. As we prevent from taking a reference for every
-- 
1.8.2.1
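
For illustration only, here is a minimal userspace sketch of the
retry-until-nothing-is-freed loop used in the hunk above. It is not kernel
code: struct toy_memcg, toy_try_to_free() and toy_flush_on_offline() are
hypothetical names, the batch size of 32 is arbitrary, and the sketch assumes
that the reclaim call returns the amount it freed in that pass, which the
patch itself does not state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a memcg's kmem state; not a kernel type. */
struct toy_memcg {
	unsigned long cached;	/* reclaimable objects still charged */
	unsigned long pinned;	/* objects still in use; never freed here */
};

/*
 * Hypothetical analogue of try_to_free_mem_cgroup_kmem(): frees at most
 * `batch` reclaimable objects and returns how many were freed. The return
 * convention is an assumption made for this sketch.
 */
static unsigned long toy_try_to_free(struct toy_memcg *cg, unsigned long batch)
{
	unsigned long freed;

	assert(cg != NULL);
	freed = cg->cached < batch ? cg->cached : batch;
	cg->cached -= freed;
	return freed;
}

/* Same shape as the loop in the patch: retry until a pass frees nothing. */
static unsigned long toy_flush_on_offline(struct toy_memcg *cg)
{
	unsigned long ret, total = 0;

	do {
		ret = toy_try_to_free(cg, 32);
		total += ret;
	} while (ret);

	return total;
}
```

Under these assumptions, the loop ends on the first pass that frees nothing,
so pinned objects are left alone while every reclaimable object is dropped
before the group goes away. A restarted workload then has to recharge its
caches against the new limit.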