On Sun, Jun 16, 2019 at 8:14 AM Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On 2019/06/16 16:37, Tetsuo Handa wrote:
> > On 2019/06/16 6:33, Tetsuo Handa wrote:
> >> On 2019/06/16 3:50, Shakeel Butt wrote:
> >>>> While dump_tasks() traverses only each thread group, mem_cgroup_scan_tasks()
> >>>> traverses each thread.
> >>>
> >>> I think mem_cgroup_scan_tasks() traversing threads is not intentional
> >>> and css_task_iter_start in it should use CSS_TASK_ITER_PROCS as the
> >>> oom killer only cares about the processes or more specifically
> >>> mm_struct (though two different thread groups can have same mm_struct
> >>> but that is fine).
> >>
> >> We can't use CSS_TASK_ITER_PROCS from mem_cgroup_scan_tasks(). I've tried
> >> CSS_TASK_ITER_PROCS in an attempt to evaluate only one thread from each
> >> thread group, but I found that CSS_TASK_ITER_PROCS causes skipping whole
> >> threads in a thread group (and trivially allowing "Out of memory and no
> >> killable processes...\n" flood) if thread group leader has already exited.
> >
> > Seems that CSS_TASK_ITER_PROCS from mem_cgroup_scan_tasks() is now working.
> >
> I found a reproducer and the commit.
>
> ----------------------------------------
> #define _GNU_SOURCE
> #include <stdio.h>
> #include <stdlib.h>
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <fcntl.h>
> #include <unistd.h>
> #include <sched.h>
> #include <sys/mman.h>
> #include <asm/unistd.h>
>
> static const unsigned long size = 1048576 * 200;
> static int thread(void *unused)
> {
> 	int fd = open("/dev/zero", O_RDONLY);
> 	char *buf = mmap(NULL, size, PROT_WRITE | PROT_READ,
> 			 MAP_ANONYMOUS | MAP_SHARED, EOF, 0);
> 	sleep(1);
> 	read(fd, buf, size);
> 	return syscall(__NR_exit, 0);
> }
> int main(int argc, char *argv[])
> {
> 	FILE *fp;
> 	mkdir("/sys/fs/cgroup/memory/test1", 0755);
> 	fp = fopen("/sys/fs/cgroup/memory/test1/memory.limit_in_bytes", "w");
> 	fprintf(fp, "%lu\n", size);
> 	fclose(fp);
> 	fp = fopen("/sys/fs/cgroup/memory/test1/tasks", "w");
> 	fprintf(fp, "%u\n", getpid());
> 	fclose(fp);
> 	clone(thread, malloc(8192) + 4096, CLONE_SIGHAND | CLONE_THREAD | CLONE_VM, NULL);
> 	return syscall(__NR_exit, 0);
> }
> ----------------------------------------
>
> Here is a patch to use CSS_TASK_ITER_PROCS.
>
> From 415e52cf55bc4ad931e4f005421b827f0b02693d Mon Sep 17 00:00:00 2001
> From: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
> Date: Mon, 17 Jun 2019 00:09:38 +0900
> Subject: [PATCH] mm: memcontrol: Use CSS_TASK_ITER_PROCS at mem_cgroup_scan_tasks().
>
> Since commit c03cd7738a83b137 ("cgroup: Include dying leaders with live
> threads in PROCS iterations") corrected how CSS_TASK_ITER_PROCS works,
> mem_cgroup_scan_tasks() can use CSS_TASK_ITER_PROCS in order to check
> only one thread from each thread group.
>
> Signed-off-by: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>

Reviewed-by: Shakeel Butt <shakeelb@xxxxxxxxxx>

Why not add the reproducer in the commit message?

> ---
>  mm/memcontrol.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index ba9138a..b09ff45 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1163,7 +1163,7 @@ int mem_cgroup_scan_tasks(struct mem_cgroup *memcg,
>  		struct css_task_iter it;
>  		struct task_struct *task;
>
> -		css_task_iter_start(&iter->css, 0, &it);
> +		css_task_iter_start(&iter->css, CSS_TASK_ITER_PROCS, &it);
>  		while (!ret && (task = css_task_iter_next(&it)))
>  			ret = fn(task, arg);
>  		css_task_iter_end(&it);
> --
> 1.8.3.1
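
For anyone following along, here is a rough userspace model of the three
iteration behaviours discussed in this thread. It is only a sketch of my
reading of the discussion, not kernel code: struct toy_task, the TOY_ITER_*
modes and toy_scan() are all made-up names, and the "old" and "fixed" PROCS
rules are simplifications of what the thread says about the state before and
after commit c03cd7738a83b137.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a task; NOT the kernel's struct task_struct. */
struct toy_task {
	int pid;	/* thread ID */
	int tgid;	/* thread group ID, equal to the group leader's pid */
	bool exited;	/* true once this thread has exited */
};

/* Hypothetical modes, named for this sketch only. */
enum toy_iter_mode {
	TOY_ITER_THREADS,	/* flags == 0: visit every live thread */
	TOY_ITER_PROCS_OLD,	/* assumed old rule: visit live leaders only */
	TOY_ITER_PROCS_FIXED,	/* assumed fixed rule: visit a leader while
				 * its group still has a live thread */
};

/* Return true if any thread in the given thread group is still alive. */
bool toy_group_has_live_thread(const struct toy_task *tasks, size_t n, int tgid)
{
	for (size_t i = 0; i < n; i++)
		if (tasks[i].tgid == tgid && !tasks[i].exited)
			return true;
	return false;
}

/* Count how many tasks a scan would hand to its callback in each mode. */
size_t toy_scan(const struct toy_task *tasks, size_t n, enum toy_iter_mode mode)
{
	size_t visited = 0;

	assert(tasks != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		const struct toy_task *t = &tasks[i];
		bool leader = (t->pid == t->tgid);

		switch (mode) {
		case TOY_ITER_THREADS:
			if (!t->exited)
				visited++;
			break;
		case TOY_ITER_PROCS_OLD:
			if (leader && !t->exited)
				visited++;
			break;
		case TOY_ITER_PROCS_FIXED:
			if (leader && toy_group_has_live_thread(tasks, n, t->tgid))
				visited++;
			break;
		}
	}
	return visited;
}
```

Take a group whose leader (pid 100) has exited while thread 101 is still
alive, which is the state the reproducer creates, plus an ordinary group with
live threads 200 and 201. In this model TOY_ITER_THREADS visits 3 tasks,
TOY_ITER_PROCS_OLD visits 1 (the whole of group 100 is skipped), and
TOY_ITER_PROCS_FIXED visits 2, i.e. one per thread group.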