On Tue 03-10-17 15:38:08, Roman Gushchin wrote: > On Tue, Oct 03, 2017 at 04:22:46PM +0200, Michal Hocko wrote: > > On Tue 03-10-17 15:08:41, Roman Gushchin wrote: > > > On Tue, Oct 03, 2017 at 03:36:23PM +0200, Michal Hocko wrote: > > [...] > > > > I guess we want to inherit the value on the memcg creation but I agree > > > > that enforcing parent setting is weird. I will think about it some more > > > > but I agree that it is saner to only enforce per memcg value. > > > > > > I'm not against, but we should come up with a good explanation, why we're > > > inheriting it; or not inherit. > > > > Inheriting sounds like a less surprising behavior. Once you opt in for > > oom_group you can expect that descendants are going to assume the same > > unless they explicitly state otherwise. > > > > [...] > > > > > > > @@ -962,6 +968,48 @@ static void oom_kill_process(struct oom_control *oc, const char *message) > > > > > > > __oom_kill_process(victim); > > > > > > > } > > > > > > > > > > > > > > +static int oom_kill_memcg_member(struct task_struct *task, void *unused) > > > > > > > +{ > > > > > > > + if (!tsk_is_oom_victim(task)) { > > > > > > > > > > > > How can this happen? > > > > > > > > > > We do start with killing the largest process, and then iterate over all tasks > > > > > in the cgroup. So, this check is required to avoid killing tasks which are > > > > > already in the termination process. > > > > > > > > Do you mean we have tsk_is_oom_victim && MMF_OOM_SKIP == T? > > > > > > No, just tsk_is_oom_victim. We're are killing the biggest task, and then _all_ > > > tasks. This is a way to skip the biggest task, and do not kill it again. > > > > OK, I have missed that part. Why are we doing that actually? Why don't > > we simply do > > /* If oom_group flag is set, kill all belonging tasks */ > > if (mem_cgroup_oom_group(oc->chosen_memcg)) > > mem_cgroup_scan_tasks(oc->chosen_memcg, oom_kill_memcg_member, > > NULL); > > > > we are going to kill all the tasks anyway. > > Well, the idea behind was that killing the biggest process give us better > chances to get out of global memory shortage and guarantee forward progress. > I can drop it, if it considered to be excessive. Yes, please do so. If we need it then we can do that in a separate patch along with the explanation why it is needed. -- Michal Hocko SUSE Labs -- To unsubscribe from this list: send the line "unsubscribe cgroups" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html