On Mon 16-03-20 15:35:10, Roman Gushchin wrote:
> If a task is getting moved out of the OOMing cgroup, it might
> result in unexpected OOM killings if memory.oom.group is used
> anywhere in the cgroup tree.
>
> Imagine the following example:
>
>           A (oom.group = 1)
>          / \
>   (OOM) B   C
>
> Let's say B's memory.max is exceeded and it's OOMing. The OOM killer
> selects a task in B as a victim, but someone asynchronously moves
> the task into C. mem_cgroup_get_oom_group() will iterate over all
> ancestors of C up to the root cgroup. In theory it had to stop
> at the oom_domain level - the memory cgroup which is OOMing.
> But because B is not an ancestor of C, it's not happening.
> Instead it chooses A (because its oom.group is set), and kills
> all tasks in A. This behavior is wrong because the OOM happened in B,
> so there is no reason to kill anything outside.
>
> Fix this by checking if the memory cgroup to which the task belongs
> is a descendant of the oom_domain. If not, memory.oom.group should
> be ignored, and the OOM killer should kill only the victim task.
>
> Signed-off-by: Roman Gushchin <guro@xxxxxx>
> Reported-by: Dan Schatzberg <dschatzberg@xxxxxx>

After the follow-up discussion I do agree that this should be sufficient
for now.

Acked-by: Michal Hocko <mhocko@xxxxxxxx>

> ---
>  mm/memcontrol.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index daa399be4688..d8c4b7aa4e73 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1930,6 +1930,14 @@ struct mem_cgroup *mem_cgroup_get_oom_group(struct task_struct *victim,
>  	if (memcg == root_mem_cgroup)
>  		goto out;
>
> +	/*
> +	 * If the victim task has been asynchronously moved to a different
> +	 * memory cgroup, we might end up killing tasks outside oom_domain.
> +	 * In this case it's better to ignore memory.oom.group.
> +	 */
> +	if (unlikely(!mem_cgroup_is_descendant(memcg, oom_domain)))
> +		goto out;
> +
>  	/*
>  	 * Traverse the memory cgroup hierarchy from the victim task's
>  	 * cgroup up to the OOMing cgroup (or root) to find the
> --
> 2.24.1

--
Michal Hocko
SUSE Labs
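For readers who want to see the race in isolation, here is a minimal userspace sketch of the A/B/C example from the changelog. It is not kernel code: struct toy_memcg, toy_is_descendant(), toy_get_oom_group() and toy_scenario() are hypothetical stand-ins invented for illustration, and the with_fix flag only models whether the descendant check from the patch is applied. The sketch also assumes A sits below a separate root cgroup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct mem_cgroup; only what this sketch needs. */
struct toy_memcg {
	const char *name;
	struct toy_memcg *parent;	/* NULL for the root */
	bool oom_group;			/* models memory.oom.group */
};

/* True if memcg is root_of_subtree itself or lives somewhere below it. */
static bool toy_is_descendant(const struct toy_memcg *memcg,
			      const struct toy_memcg *root_of_subtree)
{
	for (; memcg; memcg = memcg->parent)
		if (memcg == root_of_subtree)
			return true;
	return false;
}

/*
 * Returns the cgroup whose tasks would all be killed, or NULL if only
 * the victim task should be killed. with_fix models the added check.
 */
static const struct toy_memcg *
toy_get_oom_group(const struct toy_memcg *victim_memcg,
		  const struct toy_memcg *oom_domain, bool with_fix)
{
	const struct toy_memcg *oom_group = NULL;
	const struct toy_memcg *m;

	assert(victim_memcg && oom_domain);

	/* The victim left the OOMing subtree: ignore memory.oom.group. */
	if (with_fix && !toy_is_descendant(victim_memcg, oom_domain))
		return NULL;

	/* Walk up from the victim's cgroup, stopping at the OOM domain. */
	for (m = victim_memcg; m; m = m->parent) {
		if (m->oom_group)
			oom_group = m;
		if (m == oom_domain)
			break;
	}
	return oom_group;
}

/*
 * Builds root -> A (oom.group = 1) -> {B, C} with B as the OOM domain.
 * moved selects whether the victim has already been migrated to C.
 * Returns the name of the group to be killed, or NULL for victim only.
 */
const char *toy_scenario(bool moved, bool with_fix)
{
	struct toy_memcg root = { "root", NULL, false };
	struct toy_memcg a = { "A", &root, true };
	struct toy_memcg b = { "B", &a, false };
	struct toy_memcg c = { "C", &a, false };
	const struct toy_memcg *g;

	g = toy_get_oom_group(moved ? &c : &b, &b, with_fix);
	return g ? g->name : NULL;
}
```

Calling toy_scenario(true, false) walks from C past A to the root without ever meeting B, so A is picked; with the check enabled the same call returns NULL and only the victim would be killed. When the victim stays in B the walk stops at B immediately, with or without the check.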