David Rientjes wrote:
> Your proposal, which I mostly agree with, tries to kill additional
> processes so that they allocate and drop the lock that the original victim
> depends on.  My approach, from
> http://marc.info/?l=linux-kernel&m=144010444913702, is the same, but
> without the killing.  It's unnecessary to kill every process on the system
> that is depending on the same lock, and we can't know which processes are
> stalling on that lock and which are not.

Would you try your approach with the program below?
(My reproducers are tested on XFS on a VM with 4 CPUs / 2048MB RAM.)

---------- oom-depleter3.c start ----------
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sched.h>

static int zero_fd = EOF;
static char *buf = NULL;
static unsigned long size = 0;

/* Keep appending to a file and syncing it to generate filesystem activity. */
static int dummy(void *unused)
{
	static char buffer[4096] = { };
	int fd = open("/tmp/file", O_WRONLY | O_CREAT | O_APPEND, 0600);
	while (write(fd, buffer, sizeof(buffer)) == sizeof(buffer) &&
	       fsync(fd) == 0);
	return 0;
}

static int trigger(void *unused)
{
	read(zero_fd, buf, size); /* Will cause OOM due to overcommit */
	return 0;
}

int main(int argc, char *argv[])
{
	unsigned long i;
	zero_fd = open("/dev/zero", O_RDONLY);
	/* Double the buffer until realloc() fails, then keep the last size. */
	for (size = 1048576; size < 512UL * (1 << 30); size <<= 1) {
		char *cp = realloc(buf, size);
		if (!cp) {
			size >>= 1;
			break;
		}
		buf = cp;
	}
	/*
	 * Create many child threads in order to enlarge the time lag between
	 * the OOM killer setting TIF_MEMDIE on the thread group leader and
	 * the OOM killer sending SIGKILL to that thread.
	 */
	for (i = 0; i < 1000; i++) {
		clone(dummy, malloc(1024) + 1024, CLONE_SIGHAND | CLONE_VM,
		      NULL);
	}
	/* Let a child thread trigger the OOM killer. */
	clone(trigger, malloc(4096) + 4096, CLONE_SIGHAND | CLONE_VM, NULL);
	/* Deplete all memory reserves using the time lag. */
	for (i = size; i; i -= 4096)
		buf[i - 1] = 1;
	return * (char *) NULL; /* Kill all threads. */
}
---------- oom-depleter3.c end ----------

uptime > 350 of http://I-love.SAKURA.ne.jp/tmp/serial-20150922-1.txt.xz
shows that the memory reserves were completely depleted, and
uptime > 42 of http://I-love.SAKURA.ne.jp/tmp/serial-20150922-2.txt.xz
shows that the memory reserves were not used at all.
Is this result what you expected?