The patch titled
     Subject: memcg: schedule high reclaim for remote memcgs on high_work
has been added to the -mm tree.  Its filename is
     memcg-schedule-high-reclaim-for-remote-memcgs-on-high_work.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/memcg-schedule-high-reclaim-for-remote-memcgs-on-high_work.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/memcg-schedule-high-reclaim-for-remote-memcgs-on-high_work.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Shakeel Butt <shakeelb@xxxxxxxxxx>
Subject: memcg: schedule high reclaim for remote memcgs on high_work

If a memcg is over its high limit, memory reclaim is scheduled to run on
return to userland.  However, this assumes that the memcg is the current
process's memcg.  With remote memcg charging for kmem, or when swapping in
a page charged to a remote memcg, the current process can trigger reclaim
on a remote memcg.  Scheduling reclaim on return to userland for such
remote memcgs would skip the high reclaim altogether.  So, punt the high
reclaim of remote memcgs to high_work.
Link: http://lkml.kernel.org/r/20190103015638.205424-1-shakeelb@xxxxxxxxxx
Signed-off-by: Shakeel Butt <shakeelb@xxxxxxxxxx>
Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
Cc: Vladimir Davydov <vdavydov.dev@xxxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 mm/memcontrol.c |   20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

--- a/mm/memcontrol.c~memcg-schedule-high-reclaim-for-remote-memcgs-on-high_work
+++ a/mm/memcontrol.c
@@ -2318,19 +2318,23 @@ done_restock:
	 * reclaim on returning to userland.  We can perform reclaim here
	 * if __GFP_RECLAIM but let's always punt for simplicity and so that
	 * GFP_KERNEL can consistently be used during reclaim.  @memcg is
-	 * not recorded as it most likely matches current's and won't
-	 * change in the meantime.  As high limit is checked again before
-	 * reclaim, the cost of mismatch is negligible.
+	 * not recorded as the return-to-userland high reclaim will only reclaim
+	 * from current's memcg (or its ancestor). For other memcgs we punt them
+	 * to work queue.
	 */
	do {
		if (page_counter_read(&memcg->memory) > memcg->high) {
-			/* Don't bother a random interrupted task */
-			if (in_interrupt()) {
+			/*
+			 * Don't bother a random interrupted task or if the
+			 * memcg is not current's memcg's ancestor.
+			 */
+			if (in_interrupt() ||
+			    !mm_match_cgroup(current->mm, memcg)) {
				schedule_work(&memcg->high_work);
-				break;
+			} else {
+				current->memcg_nr_pages_over_high += batch;
+				set_notify_resume(current);
			}
-			current->memcg_nr_pages_over_high += batch;
-			set_notify_resume(current);
			break;
		}
	} while ((memcg = parent_mem_cgroup(memcg)));
_

Patches currently in -mm which might be from shakeelb@xxxxxxxxxx are

fork-memcg-fix-cached_stacks-case.patch
memcg-localize-memcg_kmem_enabled-check.patch
memcg-schedule-high-reclaim-for-remote-memcgs-on-high_work.patch