Re: [PATCH v2] memcg, oom: check memcg margin for parallel oom

On Tue, 14 Jul 2020, Yafang Shao wrote:

> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 1962232..15e0e18 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1560,15 +1560,21 @@ static bool mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
>  		.gfp_mask = gfp_mask,
>  		.order = order,
>  	};
> -	bool ret;
> +	bool ret = true;
>  
>  	if (mutex_lock_killable(&oom_lock))
>  		return true;
> +
> +	if (mem_cgroup_margin(memcg) >= (1 << order))
> +		goto unlock;
> +
>  	/*
>  	 * A few threads which were not waiting at mutex_lock_killable() can
>  	 * fail to bail out. Therefore, check again after holding oom_lock.
>  	 */
>  	ret = should_force_charge() || out_of_memory(&oc);
> +
> +unlock:
>  	mutex_unlock(&oom_lock);
>  	return ret;
>  }

Hi Yafang,

We've run with a patch very much like this for several years and it works 
quite successfully to prevent the unnecessary oom killing of processes.

We do this in out_of_memory() directly, however, because we found that we
could prevent even *more* unnecessary killing by checking at the "point of
no return": selecting a victim takes additional time, during which the oom
condition may already have been resolved.

Some may argue that this is unnecessarily exposing mem_cgroup_margin() to 
generic mm code, but in the interest of preventing any unnecessary oom 
kill we've found it to be helpful.
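
To illustrate the idea outside the kernel, here is a minimal userspace
sketch of the recheck. It is a toy model, not the kernel implementation:
struct toy_memcg, toy_margin() and toy_should_kill() are hypothetical names,
and the model ignores the memsw accounting that the real
mem_cgroup_margin() also considers.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for a memcg page counter; not a kernel type. */
struct toy_memcg {
	unsigned long usage;	/* pages currently charged */
	unsigned long limit;	/* hard limit, in pages */
};

/* Pages that can still be charged before the limit is hit. */
static unsigned long toy_margin(const struct toy_memcg *memcg)
{
	return memcg->usage < memcg->limit ? memcg->limit - memcg->usage : 0;
}

/*
 * Decide whether to kill once victim selection has finished.
 * Returns false when the 2^order page charge now fits, i.e. enough
 * memory was freed while the victim was being selected.
 */
static bool toy_should_kill(const struct toy_memcg *memcg, unsigned int order)
{
	return toy_margin(memcg) < (1UL << order);
}
```

With usage = 96 and limit = 100, an order-2 charge (4 pages) fits, so the
kill is skipped, while an order-3 charge (8 pages) still requires one.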

I proposed a variant of this in https://lkml.org/lkml/2020/3/11/1089.

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -798,6 +798,8 @@ static inline void memcg_memory_event_mm(struct mm_struct *mm,
 void mem_cgroup_split_huge_fixup(struct page *head);
 #endif
 
+unsigned long mem_cgroup_margin(struct mem_cgroup *memcg);
+
 #else /* CONFIG_MEMCG */
 
 #define MEM_CGROUP_ID_SHIFT	0
@@ -825,6 +827,11 @@ static inline void memcg_memory_event_mm(struct mm_struct *mm,
 {
 }
 
+static inline unsigned long mem_cgroup_margin(struct mem_cgroup *memcg)
+{
+	return 0;
+}
+
 static inline unsigned long mem_cgroup_protection(struct mem_cgroup *memcg,
 						  bool in_low_reclaim)
 {
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1282,7 +1282,7 @@ void mem_cgroup_update_lru_size(struct lruvec *lruvec, enum lru_list lru,
  * Returns the maximum amount of memory @mem can be charged with, in
  * pages.
  */
-static unsigned long mem_cgroup_margin(struct mem_cgroup *memcg)
+unsigned long mem_cgroup_margin(struct mem_cgroup *memcg)
 {
 	unsigned long margin = 0;
 	unsigned long count;
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -1109,9 +1109,23 @@ bool out_of_memory(struct oom_control *oc)
 		if (!is_sysrq_oom(oc) && !is_memcg_oom(oc))
 			panic("System is deadlocked on memory\n");
 	}
-	if (oc->chosen && oc->chosen != (void *)-1UL)
+	if (oc->chosen && oc->chosen != (void *)-1UL) {
+		if (is_memcg_oom(oc)) {
+			/*
+			 * If a memcg is now under its limit or current will be
+			 * exiting and freeing memory, avoid needlessly killing
+			 * chosen.
+			 */
+			if (mem_cgroup_margin(oc->memcg) >= (1 << oc->order) ||
+			    task_will_free_mem(current)) {
+				put_task_struct(oc->chosen);
+				return true;
+			}
+		}
+
 		oom_kill_process(oc, !is_memcg_oom(oc) ? "Out of memory" :
 				 "Memory cgroup out of memory");
+	}
 	return !!oc->chosen;
 }
 



