Setting a memory.high limit below the usage makes almost no effort to shrink the cgroup to the new target size. While memory.high is a "soft" limit that isn't supposed to cause OOM situations, we should still try harder to meet a user request through persistent reclaim. For example, after setting a 10M memory.high on an 800M cgroup full of file cache, the usage shrinks to about 350M: + cat /cgroup/workingset/memory.current 841568256 + echo 10M + cat /cgroup/workingset/memory.current 355729408 This isn't exactly what the user would expect to happen. Setting the value a few more times eventually whittles the usage down to what we are asking for: + echo 10M + cat /cgroup/workingset/memory.current 104181760 + echo 10M + cat /cgroup/workingset/memory.current 31801344 + echo 10M + cat /cgroup/workingset/memory.current 10440704 To improve this, add reclaim retry loops to the memory.high write() callback, similar to what we do for memory.max, to make a reasonable effort that the usage meets the requested size after the call returns. Afterwards, a single write() to memory.high is enough in all but extreme cases: + cat /cgroup/workingset/memory.current 841609216 + echo 10M + cat /cgroup/workingset/memory.current 10182656 Signed-off-by: Johannes Weiner <hannes@xxxxxxxxxxx> --- mm/memcontrol.c | 30 ++++++++++++++++++++++++------ 1 file changed, 24 insertions(+), 6 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index ff90d4e7df37..8090b4c99ac7 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -6074,7 +6074,8 @@ static ssize_t memory_high_write(struct kernfs_open_file *of, char *buf, size_t nbytes, loff_t off) { struct mem_cgroup *memcg = mem_cgroup_from_css(of_css(of)); - unsigned long nr_pages; + unsigned int nr_retries = MEM_CGROUP_RECLAIM_RETRIES; + bool drained = false; unsigned long high; int err; @@ -6085,12 +6086,29 @@ static ssize_t memory_high_write(struct kernfs_open_file *of, memcg->high = high; - nr_pages = page_counter_read(&memcg->memory); - if (nr_pages > high) - try_to_free_mem_cgroup_pages(memcg, nr_pages - high, - GFP_KERNEL, true); + for (;;) { + unsigned long nr_pages = page_counter_read(&memcg->memory); + unsigned long reclaimed; + + if (nr_pages <= high) + break; + + if (signal_pending(current)) + break; + + if (!drained) { + drain_all_stock(memcg); + drained = true; + continue; + } + + reclaimed = try_to_free_mem_cgroup_pages(memcg, nr_pages - high, + GFP_KERNEL, true); + + if (!reclaimed && !nr_retries--) + break; + } - memcg_wb_domain_size_changed(memcg); return nbytes; } -- 2.23.0