On 2019/01/11 8:59, Tetsuo Handa wrote:
> Michal Hocko wrote:
>> On Wed 09-01-19 20:34:46, Tetsuo Handa wrote:
>>> On 2019/01/09 20:03, Michal Hocko wrote:
>>>> Tetsuo,
>>>> can you confirm that these two patches are fixing the issue you have
>>>> reported please?
>>>>
>>>
>>> My patch fixes the issue better than your "[PATCH 2/2] memcg: do not
>>> report racy no-eligible OOM tasks" does.
>>
>> OK, so we are stuck again. Hooray!
>
> Andrew, will you pick up "[PATCH 3/2] memcg: Facilitate termination of memcg OOM victims." ?
> Since mm-oom-marks-all-killed-tasks-as-oom-victims.patch does not call mark_oom_victim()
> when task_will_free_mem() == true, memcg-do-not-report-racy-no-eligible-oom-tasks.patch
> does not close the race whereas my patch closes the race better.
>

I confirmed that mm-oom-marks-all-killed-tasks-as-oom-victims.patch and
memcg-do-not-report-racy-no-eligible-oom-tasks.patch are completely failing
to fix the issue I am reporting. :-(

Reproducer:
----------
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <sched.h>
#include <sys/mman.h>

#define NUMTHREADS 256
#define MMAPSIZE 4 * 10485760
#define STACKSIZE 4096

static int pipe_fd[2] = { EOF, EOF };

static int memory_eater(void *unused)
{
	int fd = open("/dev/zero", O_RDONLY);
	char *buf = mmap(NULL, MMAPSIZE, PROT_WRITE | PROT_READ,
			 MAP_ANONYMOUS | MAP_SHARED, EOF, 0);
	read(pipe_fd[0], buf, 1);
	read(fd, buf, MMAPSIZE);
	pause();
	return 0;
}

int main(int argc, char *argv[])
{
	int i;
	char *stack;
	FILE *fp;
	const unsigned long size = 1048576UL * 200;
	mkdir("/sys/fs/cgroup/memory/test1", 0755);
	fp = fopen("/sys/fs/cgroup/memory/test1/memory.limit_in_bytes", "w");
	fprintf(fp, "%lu\n", size);
	fclose(fp);
	fp = fopen("/sys/fs/cgroup/memory/test1/tasks", "w");
	fprintf(fp, "%u\n", getpid());
	fclose(fp);
	if (setgid(-2) || setuid(-2) || pipe(pipe_fd))
		return 1;
	stack = mmap(NULL, STACKSIZE * NUMTHREADS, PROT_WRITE | PROT_READ,
		     MAP_ANONYMOUS | MAP_SHARED, EOF, 0);
	for (i = 0; i < NUMTHREADS; i++)
		if (clone(memory_eater, stack + (i + 1) * STACKSIZE,
			  CLONE_VM | CLONE_FS | CLONE_FILES, NULL) == -1)
			break;
	close(pipe_fd[1]);
	pause(); // Manually enter Ctrl-C immediately after dump_header() started.
	return 0;
}
----------

Complete log is at http://I-love.SAKURA.ne.jp/tmp/serial-20190111.txt.xz :
----------
[ 71.146532][ T9694] a.out invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
[ 71.151647][ T9694] CPU: 1 PID: 9694 Comm: a.out Kdump: loaded Not tainted 5.0.0-rc1-next-20190111 #272
(...snipped...)
[ 71.304689][ T9694] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,oom_memcg=/test1,task_memcg=/test1,task=a.out,pid=9692,uid=-2
[ 71.304703][ T9694] Memory cgroup out of memory: Kill process 9692 (a.out) score 904 or sacrifice child
[ 71.309149][ T54] oom_reaper: reaped process 9750 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:185532kB
[ 71.328523][ T9748] a.out invoked oom-killer: gfp_mask=0x7080c0(GFP_KERNEL_ACCOUNT|__GFP_ZERO), order=0, oom_score_adj=0
[ 71.328552][ T9748] CPU: 4 PID: 9748 Comm: a.out Kdump: loaded Not tainted 5.0.0-rc1-next-20190111 #272
(...snipped...)
[ 71.328785][ T9748] Out of memory and no killable processes...
[ 71.329194][ T9771] a.out invoked oom-killer: gfp_mask=0x7080c0(GFP_KERNEL_ACCOUNT|__GFP_ZERO), order=0, oom_score_adj=0
(...snipped...)
[ 99.696592][ T9924] Out of memory and no killable processes...
[ 99.699001][ T9838] a.out invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
(...snipped...)
[ 99.833413][ T9838] Out of memory and no killable processes...
----------

$ grep -F 'Out of memory and no killable processes...' serial-20190111.txt | wc -l
213