On Thu 22-08-19 04:56:29, Yafang Shao wrote:
> - Why we need a per memcg oom_score_adj setting ?
> This is easy to deploy and very convenient for container.
> When we use container, we always treat memcg as a whole, if we have a per
> memcg oom_score_adj setting we don't need to set it process by process.

Why can't an initial process in the cgroup set the oom_score_adj so that
the other processes simply inherit it from there? This sounds trivial to
do with a startup script.

> It will make the user exhausted to set it to all processes in a memcg.

Then let's have scripts to set it, as they are less prone to exhaustion ;)
But seriously

> In this patch, a file named memory.oom.score_adj is introduced.
> The valid value of it is from -1000 to +1000, which is same with
> process-level oom_score_adj.
> When OOM is invoked, the effective oom_score_adj is as bellow,
> 	effective oom_score_adj = original oom_score_adj + memory.oom.score_adj

This doesn't make any sense to me. Say that a process has oom_score_adj
-1000 (never kill); then the group oom_score_adj will simply break that
expectation and the task becomes killable for any value but -1000.
Why is summing up those values even sensible?

> The valid effective value is also from -1000 to +1000.
> This is something like a hook to re-calculate the oom_score_adj.

Besides that, what are the hierarchical semantics? Say you have the
hierarchy

	A (oom_score_adj = 1000)
	 \
	  B (oom_score_adj = 500)
	   \
	    C (oom_score_adj = -1000)

What is the effective value for tasks in C (put the above summing up
aside for now and just focus on the memcg adjusting)?

--
Michal Hocko
SUSE Labs
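
For illustration of the inheritance route suggested above, a minimal
userspace sketch (not code from the patch under review; the adjustment
value 500 is arbitrary). It relies on the fact that oom_score_adj is
inherited across fork() and preserved across execve(), so a container
init can set it once before spawning everything else:

/*
 * Illustrative sketch only, not code from the patch: write the desired
 * oom_score_adj once in the container's first process, then exec the
 * real workload; every descendant inherits the adjustment.
 */
#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
	FILE *f = fopen("/proc/self/oom_score_adj", "w");

	if (f) {
		fprintf(f, "500\n");	/* example value */
		fclose(f);
	}

	/* Everything exec'd from here on inherits the adjustment. */
	if (argc > 1)
		execvp(argv[1], &argv[1]);

	return 1;
}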
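
And as a worked example of the objection to the summing rule, a small
standalone sketch (not kernel code; the memcg value here stands in for
the proposed memory.oom.score_adj setting, which is not an existing
interface) showing how an oom-disabled task would end up killable:

/*
 * Illustrative sketch only: the proposed
 *	effective oom_score_adj = task oom_score_adj + memory.oom.score_adj
 * clamped to [-1000, 1000].  A task marked "never kill" (-1000) in a
 * memcg with a positive adjustment comes out killable.
 */
#include <stdio.h>

#define OOM_SCORE_ADJ_MIN	(-1000)
#define OOM_SCORE_ADJ_MAX	1000

static int clamp_adj(int adj)
{
	if (adj < OOM_SCORE_ADJ_MIN)
		return OOM_SCORE_ADJ_MIN;
	if (adj > OOM_SCORE_ADJ_MAX)
		return OOM_SCORE_ADJ_MAX;
	return adj;
}

int main(void)
{
	int task_adj = OOM_SCORE_ADJ_MIN;	/* oom disabled */
	int memcg_adj = 500;			/* hypothetical memory.oom.score_adj */

	/* Prints -500: the task is no longer oom disabled. */
	printf("effective oom_score_adj = %d\n", clamp_adj(task_adj + memcg_adj));
	return 0;
}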