+ memcg-schedule-high-reclaim-for-remote-memcgs-on-high_work.patch added to -mm tree

The patch titled
     Subject: memcg: schedule high reclaim for remote memcgs on high_work
has been added to the -mm tree.  Its filename is
     memcg-schedule-high-reclaim-for-remote-memcgs-on-high_work.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/memcg-schedule-high-reclaim-for-remote-memcgs-on-high_work.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/memcg-schedule-high-reclaim-for-remote-memcgs-on-high_work.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Shakeel Butt <shakeelb@xxxxxxxxxx>
Subject: memcg: schedule high reclaim for remote memcgs on high_work

If a memcg is over its high limit, memory reclaim is scheduled to run on
return-to-userland.  However, that path assumes the memcg is the current
process's memcg.  With remote memcg charging for kmem, or when swapping in
a page charged to a remote memcg, the current process can push a remote
memcg over its high limit.  Reclaim scheduled on return-to-userland then
targets the current process's memcg, so the high reclaim for the remote
memcg is ignored altogether.  So, record the memcg needing high reclaim
and trigger high reclaim for that memcg on return-to-userland.  However,
if a memcg is already recorded for high reclaim and the recorded memcg is
not a descendant of the memcg needing high reclaim, punt the high reclaim
to the work queue.

Link: http://lkml.kernel.org/r/20190108200538.80371-1-shakeelb@xxxxxxxxxx
Signed-off-by: Shakeel Butt <shakeelb@xxxxxxxxxx>
Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
Cc: Vladimir Davydov <vdavydov.dev@xxxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 include/linux/sched.h |    3 +++
 kernel/fork.c         |    1 +
 mm/memcontrol.c       |   18 +++++++++++++-----
 3 files changed, 17 insertions(+), 5 deletions(-)

--- a/include/linux/sched.h~memcg-schedule-high-reclaim-for-remote-memcgs-on-high_work
+++ a/include/linux/sched.h
@@ -1172,6 +1172,9 @@ struct task_struct {
 
 	/* Used by memcontrol for targeted memcg charge: */
 	struct mem_cgroup		*active_memcg;
+
+	/* Used by memcontrol for high reclaim: */
+	struct mem_cgroup		*memcg_high_reclaim;
 #endif
 
 #ifdef CONFIG_BLK_CGROUP
--- a/kernel/fork.c~memcg-schedule-high-reclaim-for-remote-memcgs-on-high_work
+++ a/kernel/fork.c
@@ -919,6 +919,7 @@ static struct task_struct *dup_task_stru
 
 #ifdef CONFIG_MEMCG
 	tsk->active_memcg = NULL;
+	tsk->memcg_high_reclaim = NULL;
 #endif
 	return tsk;
 
--- a/mm/memcontrol.c~memcg-schedule-high-reclaim-for-remote-memcgs-on-high_work
+++ a/mm/memcontrol.c
@@ -2161,7 +2161,8 @@ void mem_cgroup_handle_over_high(void)
 	if (likely(!nr_pages))
 		return;
 
-	memcg = get_mem_cgroup_from_mm(current->mm);
+	memcg = current->memcg_high_reclaim;
+	current->memcg_high_reclaim = NULL;
 	reclaim_high(memcg, nr_pages, GFP_KERNEL);
 	css_put(&memcg->css);
 	current->memcg_nr_pages_over_high = 0;
@@ -2317,10 +2318,10 @@ done_restock:
 	 * If the hierarchy is above the normal consumption range, schedule
 	 * reclaim on returning to userland.  We can perform reclaim here
 	 * if __GFP_RECLAIM but let's always punt for simplicity and so that
-	 * GFP_KERNEL can consistently be used during reclaim.  @memcg is
-	 * not recorded as it most likely matches current's and won't
-	 * change in the meantime.  As high limit is checked again before
-	 * reclaim, the cost of mismatch is negligible.
+	 * GFP_KERNEL can consistently be used during reclaim. Record the memcg
+	 * for the return-to-userland high reclaim. If the memcg is already
+	 * recorded and the recorded memcg is not the descendant of the memcg
+	 * needing high reclaim, punt the high reclaim to the work queue.
 	 */
 	do {
 		if (page_counter_read(&memcg->memory) > memcg->high) {
@@ -2328,6 +2329,13 @@ done_restock:
 			if (in_interrupt()) {
 				schedule_work(&memcg->high_work);
 				break;
+			} else if (!current->memcg_high_reclaim) {
+				css_get(&memcg->css);
+				current->memcg_high_reclaim = memcg;
+			} else if (!mem_cgroup_is_descendant(
+					current->memcg_high_reclaim, memcg)) {
+				schedule_work(&memcg->high_work);
+				break;
 			}
 			current->memcg_nr_pages_over_high += batch;
 			set_notify_resume(current);
_
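
For readers following the decision made in the charge path, here is a
minimal userspace sketch of it.  It is not kernel code: struct memcg,
struct task, enum action and note_over_high() are hypothetical stand-ins
invented for illustration, the refcount is a plain int, and only a single
over-high memcg is considered rather than the walk up the hierarchy that
the patched loop performs.  is_descendant() is assumed to behave like
mem_cgroup_is_descendant(), treating a memcg as a descendant of itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mem_cgroup: parent link and refcount only. */
struct memcg {
	struct memcg *parent;
	int refs;
};

/* Hypothetical stand-in for the task_struct fields the patch uses. */
struct task {
	struct memcg *memcg_high_reclaim;
	unsigned int nr_pages_over_high;
};

enum action { RECORDED, ACCUMULATED, PUNTED_TO_WORK };

/* True if memcg is root itself or lies below root in the hierarchy. */
static bool is_descendant(const struct memcg *memcg, const struct memcg *root)
{
	for (; memcg; memcg = memcg->parent)
		if (memcg == root)
			return true;
	return false;
}

/*
 * Model of the branch added to the charge path for one memcg that is
 * over its high limit.  PUNTED_TO_WORK stands for
 * schedule_work(&memcg->high_work) in the patch.
 */
static enum action note_over_high(struct task *t, struct memcg *memcg,
				  unsigned int batch, bool in_irq)
{
	if (in_irq)
		return PUNTED_TO_WORK;

	if (!t->memcg_high_reclaim) {
		/* Nothing recorded yet: pin the memcg and remember it. */
		memcg->refs++;
		t->memcg_high_reclaim = memcg;
		t->nr_pages_over_high += batch;
		return RECORDED;
	}

	/* Recorded memcg is outside this memcg's subtree: use the workqueue. */
	if (!is_descendant(t->memcg_high_reclaim, memcg))
		return PUNTED_TO_WORK;

	/* Recorded memcg is at or below this one: reuse the recorded memcg. */
	t->nr_pages_over_high += batch;
	return ACCUMULATED;
}
```

In this model, recording a child memcg first and then hitting its parent
only accumulates pages against the recorded child, while hitting an
unrelated memcg is punted to the work queue and leaves the recorded state
untouched.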

Patches currently in -mm which might be from shakeelb@xxxxxxxxxx are

fork-memcg-fix-cached_stacks-case.patch
memcg-localize-memcg_kmem_enabled-check.patch
memcg-schedule-high-reclaim-for-remote-memcgs-on-high_work.patch



