On Wed 30-11-22 15:01:58, chengkaitao wrote:
> From: chengkaitao <pilgrimtao@xxxxxxxxx>
>
> We created a new interface <memory.oom.protect> for memory, If there is
> the OOM killer under parent memory cgroup, and the memory usage of a
> child cgroup is within its effective oom.protect boundary, the cgroup's
> tasks won't be OOM killed unless there is no unprotected tasks in other
> children cgroups. It draws on the logic of <memory.min/low> in the
> inheritance relationship.

Could you be more specific about use cases? How do you tune oom.protect
wrt other tunables? How does this interact with the oom_score_adj
tuning (e.g. a first hand oom victim with the score_adj 1000 sitting
in an oom protected memcg)?

I haven't really read through the whole patch but this struck me as odd.

> @@ -552,8 +552,19 @@ static int proc_oom_score(struct seq_file *m, struct pid_namespace *ns,
>  	unsigned long totalpages = totalram_pages() + total_swap_pages;
>  	unsigned long points = 0;
>  	long badness;
> +#ifdef CONFIG_MEMCG
> +	struct mem_cgroup *memcg;
>
> -	badness = oom_badness(task, totalpages);
> +	rcu_read_lock();
> +	memcg = mem_cgroup_from_task(task);
> +	if (memcg && !css_tryget(&memcg->css))
> +		memcg = NULL;
> +	rcu_read_unlock();
> +
> +	update_parent_oom_protection(root_mem_cgroup, memcg);
> +	css_put(&memcg->css);
> +#endif
> +	badness = oom_badness(task, totalpages, MEMCG_OOM_PROTECT);

The badness means a different thing depending on which memcg hierarchy
subtree you look at. Scaling based on the global oom could get really
misleading.

-- 
Michal Hocko
SUSE Labs