On 2018/10/12 21:58, Tetsuo Handa wrote:
> On 2018/10/12 21:41, Johannes Weiner wrote:
>> On Fri, Oct 12, 2018 at 09:10:40PM +0900, Tetsuo Handa wrote:
>>> On 2018/10/12 21:08, Michal Hocko wrote:
>>>>> So not more than 10 dumps in each 5s interval. That looks reasonable
>>>>> to me. By the time it starts dropping data you have more than enough
>>>>> information to go on already.

Not reasonable at all.

>>>>
>>>> Yeah. Unless we have a storm coming from many different cgroups in
>>>> parallel. But even then we have the allocation context for each OOM so
>>>> we are not losing everything. Should we ever tune this, it can be done
>>>> later with some explicit examples.
>>>>
>>>>> Acked-by: Johannes Weiner <hannes@xxxxxxxxxxx>
>>>>
>>>> Thanks! I will post the patch to Andrew early next week.
>>>>

One thread from one cgroup is sufficient. I don't think that Michal's patch is
an appropriate mitigation. It still needlessly floods kernel log buffer and
significantly defers recovery operation.

Nacked-by: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>

---------- Testcase ----------

#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
	FILE *fp;
	const unsigned long size = 1048576 * 200;
	char *buf = malloc(size);
	mkdir("/sys/fs/cgroup/memory/test1", 0755);
	fp = fopen("/sys/fs/cgroup/memory/test1/memory.limit_in_bytes", "w");
	fprintf(fp, "%lu\n", size / 2);
	fclose(fp);
	fp = fopen("/sys/fs/cgroup/memory/test1/tasks", "w");
	fprintf(fp, "%u\n", getpid());
	fclose(fp);
	fp = fopen("/proc/self/oom_score_adj", "w");
	fprintf(fp, "-1000\n");
	fclose(fp);
	fp = fopen("/dev/zero", "r");
	fread(buf, 1, size, fp);
	fclose(fp);
	return 0;
}

---------- Michal's patch ----------

73133 lines (5.79MB) of kernel messages per one run

[root@ccsecurity ~]# time ./a.out

real    3m44.389s
user    0m0.000s
sys     3m42.334s

[root@ccsecurity ~]# time ./a.out

real    3m41.767s
user    0m0.004s
sys     3m39.779s

---------- My v2 patch ----------

50 lines (3.40 KB) of kernel messages per one run

[root@ccsecurity ~]# time ./a.out

real    0m5.227s
user    0m0.000s
sys     0m4.950s

[root@ccsecurity ~]# time ./a.out

real    0m5.249s
user    0m0.000s
sys     0m4.956s