Tetsuo Handa wrote:
> Sorry. This was my misunderstanding. But I still think that we need to be
> prepared for cases where zapping OOM victim's mm approach fails.
> ( http://lkml.kernel.org/r/201509242050.EHE95837.FVFOOtMQHLJOFS@xxxxxxxxxxxxxxxxxxx )

I tested how difficult it is to make the approach of zapping the OOM
victim's mm fail. It turns out that making it fail is not difficult.

---------- Reproducer start ----------
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sched.h>
#include <sys/mman.h>

/* Repeatedly read /proc/self/cmdline, which takes mmap_sem for reading. */
static int reader(void *unused)
{
	char c;
	int fd = open("/proc/self/cmdline", O_RDONLY);
	while (pread(fd, &c, 1, 0) == 1);
	return 0;
}

/* Repeatedly mmap()/munmap(), which takes mmap_sem for writing. */
static int writer(void *unused)
{
	const int fd = open("/proc/self/exe", O_RDONLY);
	static void *ptr[10000];
	int i;
	sleep(2);
	while (1) {
		for (i = 0; i < 10000; i++)
			ptr[i] = mmap(NULL, 4096, PROT_READ, MAP_PRIVATE, fd, 0);
		for (i = 0; i < 10000; i++)
			munmap(ptr[i], 4096);
	}
	return 0;
}

int main(int argc, char *argv[])
{
	int zero_fd = open("/dev/zero", O_RDONLY);
	char *buf = NULL;
	unsigned long size = 0;
	int i;
	/* Grow the buffer until realloc() fails, then keep the last good size. */
	for (size = 1048576; size < 512UL * (1 << 30); size <<= 1) {
		char *cp = realloc(buf, size);
		if (!cp) {
			size >>= 1;
			break;
		}
		buf = cp;
	}
	/* 100 reader threads and one writer thread sharing this mm. */
	for (i = 0; i < 100; i++) {
		clone(reader, malloc(1024) + 1024,
		      CLONE_THREAD | CLONE_SIGHAND | CLONE_VM, NULL);
	}
	clone(writer, malloc(1024) + 1024,
	      CLONE_THREAD | CLONE_SIGHAND | CLONE_VM, NULL);
	read(zero_fd, buf, size); /* Will cause OOM due to overcommit */
	return * (char *) NULL; /* Kill all threads. */
}
---------- Reproducer end ----------

(I wrote this program to mimic a problem in which a customer's system
hung with a lot of ps processes blocked while reading /proc/pid/ entries,
due to an unkillable down_read(&mm->mmap_sem) in __access_remote_vm().
I couldn't identify which function was holding the mmap_sem for writing,
though...)

Uptime > 429 of http://I-love.SAKURA.ne.jp/tmp/serial-20151006.txt.xz
showed an OOM livelock in which

  (1) the thread group leader is blocked at down_read(&mm->mmap_sem) in
      exit_mm() called from do_exit().

  (2) the writer thread is blocked at down_write(&mm->mmap_sem) in
      vm_mmap_pgoff() called from SyS_mmap_pgoff() called from SyS_mmap().

  (3) many reader threads are blocking the writer thread because of
      down_read(&mm->mmap_sem) called from proc_pid_cmdline_read().

  (4) while the thread group leader is blocked at
      down_read(&mm->mmap_sem), some of the reader threads are trying to
      allocate memory via page fault.

So, zapping the first OOM victim's mm might fail by chance.
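
The dependency chain in (1) to (4) can be sketched with a userspace toy
model. This is not kernel code: the toy_* names are hypothetical, and the
model assumes that a newly arriving reader has to queue behind a writer
that is already waiting, which is the behaviour the trace above suggests
for mmap_sem.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of a fair reader/writer lock; not kernel code. */
struct toy_rwsem {
	int active_readers;	/* readers currently holding the lock */
	bool writer_active;	/* true while a writer holds the lock */
	int queued_writers;	/* writers waiting for the lock */
};

/* Assumption: a new reader waits whenever a writer holds or awaits the lock. */
static bool toy_read_trylock(struct toy_rwsem *sem)
{
	if (sem->writer_active || sem->queued_writers > 0)
		return false;
	sem->active_readers++;
	return true;
}

/* A writer needs exclusive access; if the lock is held, it queues instead. */
static bool toy_write_lock_or_queue(struct toy_rwsem *sem)
{
	if (sem->writer_active || sem->active_readers > 0) {
		sem->queued_writers++;
		return false;
	}
	sem->writer_active = true;
	return true;
}

/* Replays (1)-(4); returns true if the group leader cannot take the read lock. */
static bool toy_leader_is_blocked(int nr_readers)
{
	struct toy_rwsem sem = { 0, false, 0 };
	int i;

	/* (3)/(4): readers take the lock, then stall on allocation while holding it. */
	for (i = 0; i < nr_readers; i++) {
		bool ok = toy_read_trylock(&sem);
		assert(ok);
	}

	/* (2): the writer queues behind the active readers. */
	if (toy_write_lock_or_queue(&sem))
		return false;	/* no reader held the lock, so nothing blocks */

	/* (1): the leader's read attempt in exit_mm() now waits behind the writer. */
	return !toy_read_trylock(&sem);
}
```

In this model toy_leader_is_blocked(100) returns true and
toy_leader_is_blocked(0) returns false: the leader gets stuck only when
readers that are themselves waiting for memory keep the writer queued.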