On Mon, Jul 26, 2021 at 12:27 AM Michal Hocko <mhocko@xxxxxxxx> wrote:
> [...]
>
> Is process_mrelease on all of them really necessary? I thought that the
> primary reason for the call is to guarantee a forward progress in cases
> where the userspace OOM victim cannot die on SIGKILL. That should be
> more an exception than a normal case, no?
>

I am thinking of using this API in the following way: on a user-defined
OOM condition, kill a job/cgroup and unconditionally reap all of its
processes. Keep monitoring the situation and, if it does not improve, go
for another kill and reap.

I can add additional logic between the kill and the reap to check whether
the reap is necessary, but reaping unconditionally is simpler.

>
> > An alternative would be to have a cgroup specific interface for
> > reaping similar to cgroup.kill.
>
> Could you elaborate?
>

I mentioned this in [1], where I was wondering whether it makes sense to
overload cgroup.kill to also add the SIGKILLed processes to
oom_reaper_list. The downside would be that there will be only one thread
doing the reaping, whereas the syscall approach allows userspace to reap
in multiple threads. I think for now I would go with whatever Suren is
proposing, and we can always add more if the need arises.

[1] https://lore.kernel.org/containers/CALvZod4jsb6bFzTOS4ZRAJGAzBru0oWanAhezToprjACfGm+ew@xxxxxxxxxxxxxx/
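
To make the kill-then-reap flow above concrete, here is a rough userspace
sketch. It is only an illustration under stated assumptions, not what any
real OOM daemon does. The helper name kill_and_reap and its injectable
parameters are hypothetical. The syscall number 448 is an assumption: it is
the number process_mrelease has on x86_64 and arm64 in mainline kernels
that include it, and it was not settled when this thread was written. The
sketch also assumes Linux with Python 3.9+ for os.pidfd_open and
signal.pidfd_send_signal, and it kills per process rather than through
cgroup.kill.

```python
import ctypes
import errno
import os
import signal

# Assumption: process_mrelease is syscall 448 (x86_64/arm64, mainline).
SYS_process_mrelease = 448


def process_mrelease(pidfd, flags=0):
    """Thin wrapper around the raw syscall; raises OSError on failure."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    ret = libc.syscall(ctypes.c_long(SYS_process_mrelease),
                       ctypes.c_int(pidfd), ctypes.c_uint(flags))
    if ret != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def kill_and_reap(pids,
                  open_pidfd=lambda pid: os.pidfd_open(pid),
                  send_kill=lambda fd: signal.pidfd_send_signal(
                      fd, signal.SIGKILL),
                  reap=process_mrelease,
                  close=os.close):
    """SIGKILL every pid, then unconditionally reap each of them.

    Returns the pids whose reap call succeeded. The callables are
    parameters only so the flow can be exercised without a real kernel.
    """
    # Take pidfds first so later calls cannot hit a recycled pid.
    pidfds = {}
    for pid in pids:
        try:
            pidfds[pid] = open_pidfd(pid)
        except ProcessLookupError:
            continue  # process is already gone

    reaped = []
    try:
        # Kill everything before reaping anything.
        for fd in pidfds.values():
            try:
                send_kill(fd)
            except ProcessLookupError:
                pass  # exited between open and kill
        # No check whether reaping is needed: just try it on every victim.
        for pid, fd in pidfds.items():
            try:
                reap(fd)
                reaped.append(pid)
            except ProcessLookupError:
                pass  # ESRCH: nothing left to reap
    finally:
        for fd in pidfds.values():
            close(fd)
    return reaped
```

A monitoring loop would call kill_and_reap() on the job's pids (for
example, those read from cgroup.procs) whenever its OOM condition triggers,
and call it again on the next victim if the situation does not improve.
Since every reap is an independent syscall on its own pidfd, the second
loop could equally be spread over several threads, which is the advantage
over a single in-kernel reaper mentioned above.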