On Tue 17-03-20 11:25:52, Robert Kolchmeyer wrote:
> On Tue, Mar 10, 2020 at 3:54 PM David Rientjes <rientjes@xxxxxxxxxx> wrote:
> >
> > Robert, could you elaborate on the user-visible effects of this issue that
> > caused it to initially get reported?
> >
>
> Ami (now cc'ed) knows more, but here is my understanding. The use case
> involves a Docker container running multiple processes. The container
> has a memory limit set. The container contains two long-lived,
> important processes p1 and p2, and some arbitrary, dynamic number of
> usually ephemeral processes p3,...,pn. These processes are structured
> in a hierarchy that looks like p1->p2->[p3,...,pn]; p1 is a parent of
> p2, and p2 is the parent for all of the ephemeral processes p3,...,pn.
>
> Since p1 and p2 are long-lived and important, the user does not want
> p1 and p2 to be oom-killed. However, p3,...,pn are expected to use a
> lot of memory, and it's ok for those processes to be oom-killed.
>
> If the user sets oom_score_adj on p1 and p2 to make them very unlikely
> to be oom-killed, p3,...,pn will inherit the oom_score_adj value,
> which is bad. Additionally, setting oom_score_adj on p3,...,pn is
> tricky, since processes in the Docker container (specifically p1 and
> p2) don't have permissions to set oom_score_adj on p3,...,pn. The
> ephemeral nature of p3,...,pn also makes setting oom_score_adj on them
> tricky after they launch.

Thanks for the clarification.

> So, the user hopes that when one of p3,...,pn triggers an oom
> condition in the Docker container, the oom killer will almost always
> kill processes from p3,...,pn (and not kill p1 or p2, which are both
> important and unlikely to trigger an oom condition). The issue of more
> processes being killed than are strictly necessary is resulting in p1
> or p2 being killed much more frequently when one of p3,...,pn triggers
> an oom condition, and p1 or p2 being killed is very disruptive for the
> user (my understanding is that p1 or p2 going down with high frequency
> results in significant unhealthiness in the user's service).

Do you have any logs showing this condition? I am interested because
from your description it seems like p1/p2 shouldn't be usually those
which trigger the oom, right? That suggests that it should be mostly
p3, ... pn to be in the kernel triggering the oom and therefore they
shouldn't vanish.
-- 
Michal Hocko
SUSE Labs