On Fri, Dec 3, 2021 at 8:24 AM Dan Schatzberg <schatzberg.dan@xxxxxxxxx> wrote:
>
> Our container agent wants to know when a container exits if it was OOM
> killed or not to report to the user. We use memory.oom.group = 1 to
> ensure that OOM kills within the container's cgroup kill
> everything. Existing memory.events are insufficient for knowing if
> this triggered:
>
> 1) Our current approach reads memory.events oom_kill and reports the
> container was killed if the value is non-zero. This is erroneous in
> some cases where containers create their children cgroups with
> memory.oom.group=1 as such OOM kills will get counted against the
> container cgroup's oom_kill counter despite not actually OOM killing
> the entire container.
>
> 2) Reading memory.events.local will fail to identify OOM kills in leaf
> cgroups (that don't set memory.oom.group) within the container cgroup.
>
> This patch adds a new oom_group_kill event when memory.oom.group
> triggers to allow userspace to cleanly identify when an entire cgroup
> is oom killed.
>
> Signed-off-by: Dan Schatzberg <schatzberg.dan@xxxxxxxxx>

So, with this patch, will you be watching oom_group_kill from
memory.events or memory.events.local file for your use-case?

Reviewed-by: Shakeel Butt <shakeelb@xxxxxxxxxx>
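
For illustration only, here is a minimal userspace sketch of the check the patch description has in mind. It assumes the events file keeps its flat "key value" per-line format and that the new counter shows up under the name oom_group_kill. The helper names and the cgroup directory are hypothetical, and the sketch takes no position on whether memory.events or memory.events.local is the right file to read; the caller passes the filename.

```python
from pathlib import Path


def parse_memory_events(text: str) -> dict[str, int]:
    """Parse flat 'key value' lines, as found in cgroup v2 memory.events."""
    events = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip anything that is not exactly "<name> <non-negative integer>".
        if len(parts) == 2 and parts[1].isdigit():
            events[parts[0]] = int(parts[1])
    return events


def group_oom_killed(events: dict[str, int]) -> bool:
    # oom_group_kill is the counter this patch proposes; a kernel without
    # the patch will not report it, so a missing key is treated as zero.
    return events.get("oom_group_kill", 0) > 0


def read_events(cgroup_dir: str, filename: str = "memory.events") -> dict[str, int]:
    # cgroup_dir is a hypothetical path, e.g. the container's cgroup directory.
    return parse_memory_events(Path(cgroup_dir, filename).read_text())
```

An agent could call read_events() on the container's cgroup directory at exit and report a group OOM kill only when group_oom_killed() returns True, rather than inferring it from a non-zero oom_kill.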