On 2019/06/16 16:37, Tetsuo Handa wrote:
> On 2019/06/16 6:33, Tetsuo Handa wrote:
>> On 2019/06/16 3:50, Shakeel Butt wrote:
>>>> While dump_tasks() traverses only each thread group, mem_cgroup_scan_tasks()
>>>> traverses each thread.
>>>
>>> I think mem_cgroup_scan_tasks() traversing threads is not intentional
>>> and css_task_iter_start in it should use CSS_TASK_ITER_PROCS as the
>>> oom killer only cares about the processes or more specifically
>>> mm_struct (though two different thread groups can have same mm_struct
>>> but that is fine).
>>
>> We can't use CSS_TASK_ITER_PROCS from mem_cgroup_scan_tasks(). I've tried
>> CSS_TASK_ITER_PROCS in an attempt to evaluate only one thread from each
>> thread group, but I found that CSS_TASK_ITER_PROCS causes skipping whole
>> threads in a thread group (and trivially allowing "Out of memory and no
>> killable processes...\n" flood) if thread group leader has already exited.
>
> Seems that CSS_TASK_ITER_PROCS from mem_cgroup_scan_tasks() is now working.

I found a reproducer and the commit.
----------------------------------------
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <sched.h>
#include <sys/mman.h>
#include <asm/unistd.h>

static const unsigned long size = 1048576 * 200;

/* Runs in the cloned (non-leader) thread. */
static int thread(void *unused)
{
	int fd = open("/dev/zero", O_RDONLY);
	/* fd argument is ignored for MAP_ANONYMOUS; EOF is used as -1. */
	char *buf = mmap(NULL, size, PROT_WRITE | PROT_READ,
			 MAP_ANONYMOUS | MAP_SHARED, EOF, 0);

	/* Give the thread group leader time to exit first. */
	sleep(1);
	/* Touch every page of the mapping so that the memcg limit is hit. */
	read(fd, buf, size);
	return syscall(__NR_exit, 0);
}

int main(int argc, char *argv[])
{
	FILE *fp;

	/* Create a memcg limited to "size" bytes and move ourselves into it. */
	mkdir("/sys/fs/cgroup/memory/test1", 0755);
	fp = fopen("/sys/fs/cgroup/memory/test1/memory.limit_in_bytes", "w");
	fprintf(fp, "%lu\n", size);
	fclose(fp);
	fp = fopen("/sys/fs/cgroup/memory/test1/tasks", "w");
	fprintf(fp, "%u\n", getpid());
	fclose(fp);
	clone(thread, malloc(8192) + 4096,
	      CLONE_SIGHAND | CLONE_THREAD | CLONE_VM, NULL);
	/* Raw exit(2): only the leader exits, the cloned thread stays alive. */
	return syscall(__NR_exit, 0);
}
----------------------------------------

Here is a patch to use CSS_TASK_ITER_PROCS.

From 415e52cf55bc4ad931e4f005421b827f0b02693d Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
Date: Mon, 17 Jun 2019 00:09:38 +0900
Subject: [PATCH] mm: memcontrol: Use CSS_TASK_ITER_PROCS at mem_cgroup_scan_tasks().

Since commit c03cd7738a83b137 ("cgroup: Include dying leaders with live
threads in PROCS iterations") corrected how CSS_TASK_ITER_PROCS works,
mem_cgroup_scan_tasks() can use CSS_TASK_ITER_PROCS in order to check
only one thread from each thread group.

Signed-off-by: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
---
 mm/memcontrol.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index ba9138a..b09ff45 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1163,7 +1163,7 @@ int mem_cgroup_scan_tasks(struct mem_cgroup *memcg,
 		struct css_task_iter it;
 		struct task_struct *task;
 
-		css_task_iter_start(&iter->css, 0, &it);
+		css_task_iter_start(&iter->css, CSS_TASK_ITER_PROCS, &it);
 		while (!ret && (task = css_task_iter_next(&it)))
 			ret = fn(task, arg);
 		css_task_iter_end(&it);
-- 
1.8.3.1
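
For readers unfamiliar with the flag, here is a rough userspace model of what the one-line change means for the callback. This is not kernel code and does not reflect how css_task_iter is implemented. struct toy_task, TOY_ITER_PROCS, toy_scan_tasks() and toy_count() are names made up for this sketch. It only illustrates "call fn() once per thread" versus "call fn() once per thread group".

```c
#include <stddef.h>

/* Hypothetical stand-in for task_struct: one entry per thread. */
struct toy_task {
	int pid;	/* thread id */
	int tgid;	/* thread group id */
};

/* Models CSS_TASK_ITER_PROCS: visit one task per thread group. */
#define TOY_ITER_PROCS 1U

/*
 * Call fn() for each task, or with TOY_ITER_PROCS only for the first
 * task seen in each thread group. Stops as soon as fn() returns
 * non-zero, like the loop in mem_cgroup_scan_tasks().
 */
static int toy_scan_tasks(const struct toy_task *tasks, size_t n,
			  unsigned int flags,
			  int (*fn)(const struct toy_task *, void *),
			  void *arg)
{
	int ret = 0;
	size_t i, j;

	for (i = 0; i < n && !ret; i++) {
		if (flags & TOY_ITER_PROCS) {
			/* Skip if an earlier entry is in the same group. */
			for (j = 0; j < i; j++)
				if (tasks[j].tgid == tasks[i].tgid)
					break;
			if (j < i)
				continue;
		}
		ret = fn(&tasks[i], arg);
	}
	return ret;
}

/* Example callback: count how many times it was invoked. */
static int toy_count(const struct toy_task *task, void *arg)
{
	(void)task;
	++*(int *)arg;
	return 0;
}
```

With three threads in group 100 and one in group 200, toy_count() runs four times with flags == 0 and twice with TOY_ITER_PROCS. In this model a group whose leader entry is gone still gets one visit. That is the property the patch depends on, and in the kernel it is the commit referenced in the changelog that is stated to provide it.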