On Tue 24-10-17 11:45:11, Johannes Weiner wrote:
> On Fri, Oct 13, 2017 at 08:35:55AM +0200, Michal Hocko wrote:
> > On Thu 12-10-17 15:03:12, Johannes Weiner wrote:
> > > All I'm saying is that, when the syscall-context fails to charge, we
> > > should do mem_cgroup_oom() to set up the async OOM killer, let the
> > > charge succeed over the hard limit - since the OOM killer will most
> > > likely get us back below the limit - then mem_cgroup_oom_synchronize()
> > > before the syscall returns to userspace.
> >
> > OK, then we are on the same page now. Your initial wording didn't
> > mention async OOM killer. This makes more sense. Although I would argue
> > that we can retry the charge as long as out_of_memory finds a victim.
> > This would return ENOMEM to the pathological cases where no victims
> > could be found.
>
> I think that's much worse because it's even harder to test and verify
> your applications against.

Well, the main distinction to the global OOM killer is that we panic
when there is no oom victim eligible which we cannot do in the memcg
context. So we have to bail somehow and I would be really careful to
allow for a runaway from the hard limit just because we are out of all
eligible tasks. Returning ENOMEM sounds like a safer option to me.

> If syscalls can return -ENOMEM on OOM, they should do so reliably.

The main problem is that we do not know which syscalls can return ENOMEM

-- 
Michal Hocko
SUSE Labs