On Mon 09-10-17 20:04:09, Michal Hocko wrote:
> [CC Johannes - the thread starts
> http://lkml.kernel.org/r/20171005222144.123797-1-shakeelb@xxxxxxxxxx]
>
> On Mon 09-10-17 10:52:44, Greg Thelen wrote:
[...]
> > A few ideas on how to make it more flexible:
> >
> > a) Go back to memcg oom killing within memcg charging. This runs risk
> > of oom killing while caller holds locks which oom victim selection or
> > oom victim termination may need. Google's been running this way for
> > a while.

We can actually reopen this discussion now that the oom handling is
async due to the oom_reaper. At least for the v2 interface. I would have
to think about it much more, but the primary concern for this patch was
whether we really need/want to charge short-term objects which do not
outlive a single syscall.

> > b) Have every syscall return do something similar to page fault handler:
> > kmem allocations in oom memcg mark the current task as needing an oom
> > check return NULL. If marked oom, syscall exit would use
> > mem_cgroup_oom_synchronize() before retrying the syscall. Seems
> > risky. I doubt every syscall is compatible with such a restart.

Yes, this is simply a no-go.

> > c) Overcharge kmem to oom memcg and queue an async memcg limit checker,
> > which will oom kill if needed.

This is what we have max limit for.
-- 
Michal Hocko
SUSE Labs