Does this sound like something that you would be interested in? I can spend
some more time on it if it is worthwhile.

On Fri 13-10-17 17:24:21, Michal Hocko wrote:
> Well, it actually occurred to me that this would trigger the global oom
> killer in case no memcg specific victim can be found, which is definitely
> not something we would like to do. This should work better. I am not
> sure we can trigger this corner case but we should cover it, and it
> actually doesn't make the code much worse.
> ---
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index d5f3a62887cf..7b370f070b82 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1528,26 +1528,40 @@ static void memcg_oom_recover(struct mem_cgroup *memcg)
>  
> -static void mem_cgroup_oom(struct mem_cgroup *memcg, gfp_t mask, int order)
> +static bool mem_cgroup_oom(struct mem_cgroup *memcg, gfp_t mask, int order)
>  {
> -	if (!current->memcg_may_oom)
> -		return;
>  	/*
>  	 * We are in the middle of the charge context here, so we
>  	 * don't want to block when potentially sitting on a callstack
>  	 * that holds all kinds of filesystem and mm locks.
>  	 *
> -	 * Also, the caller may handle a failed allocation gracefully
> -	 * (like optional page cache readahead) and so an OOM killer
> -	 * invocation might not even be necessary.
> +	 * cgroup v1 allows sync user space handling so we cannot afford
> +	 * to get stuck here for that configuration. That's why we don't do
> +	 * anything here except remember the OOM context and then deal with
> +	 * it at the end of the page fault when the stack is unwound, the
> +	 * locks are released, and when we know whether the fault was overall
> +	 * successful.
> +	 *
> +	 * On the other hand, the in-kernel OOM killer allows for an async victim
> +	 * memory reclaim (oom_reaper) and that means that we are not solely
> +	 * relying on the oom victim to make forward progress so we can stay
> +	 * in the try_charge context and keep retrying as long as there
> +	 * are oom victims to select.
>  	 *
> -	 * That's why we don't do anything here except remember the
> -	 * OOM context and then deal with it at the end of the page
> -	 * fault when the stack is unwound, the locks are released,
> -	 * and when we know whether the fault was overall successful.
> +	 * Please note that mem_cgroup_out_of_memory might fail to find a victim
> +	 * and then we have to rely on mem_cgroup_oom_synchronize, otherwise we
> +	 * would fall back to the global oom killer in pagefault_out_of_memory.
>  	 */
> +	if (!memcg->oom_kill_disable &&
> +	    mem_cgroup_out_of_memory(memcg, mask, order))
> +		return true;
> +
> +	if (!current->memcg_may_oom)
> +		return false;
>  	css_get(&memcg->css);
>  	current->memcg_in_oom = memcg;
>  	current->memcg_oom_gfp_mask = mask;
>  	current->memcg_oom_order = order;
> +
> +	return false;
>  }
>  
>  /**
> @@ -2007,8 +2021,11 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
>  
>  	mem_cgroup_event(mem_over_limit, MEMCG_OOM);
>  
> -	mem_cgroup_oom(mem_over_limit, gfp_mask,
> -			get_order(nr_pages * PAGE_SIZE));
> +	if (mem_cgroup_oom(mem_over_limit, gfp_mask,
> +				get_order(nr_pages * PAGE_SIZE))) {
> +		nr_retries = MEM_CGROUP_RECLAIM_RETRIES;
> +		goto retry;
> +	}
>  nomem:
>  	if (!(gfp_mask & __GFP_NOFAIL))
>  		return -ENOMEM;
> -- 
> Michal Hocko
> SUSE Labs

-- 
Michal Hocko
SUSE Labs
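
As an aside, the proposed control flow can be modelled outside the kernel.
The sketch below is a plain userspace C program, not kernel code; struct
memcg, charge(), select_victim(), memcg_oom() and all page counts are
invented stand-ins, and only the branching mirrors the diff above: a
successful in-kernel OOM kill resets the retry budget and the charge is
retried, while a failed kill (or oom_kill_disable) leaves the -ENOMEM to be
sorted out at the end of the page fault.

/*
 * Userspace sketch of the control flow proposed in the patch above.
 * This is a model, not kernel code: struct memcg, charge(),
 * select_victim() and the page counts are invented for illustration.
 */
#include <stdbool.h>
#include <stdio.h>

#define RECLAIM_RETRIES 5

struct memcg {
	bool oom_kill_disable;	/* cgroup v1: defer to a userspace handler */
	long usage;		/* pages currently charged */
	long limit;		/* hard limit in pages */
	int victims;		/* tasks the OOM killer could still kill */
};

/* Stand-in for mem_cgroup_out_of_memory(): kill a victim, free its pages. */
static bool select_victim(struct memcg *mc)
{
	if (mc->victims <= 0)
		return false;
	mc->victims--;
	mc->usage -= 64;	/* pretend the victim pinned 64 pages */
	return true;
}

/*
 * Stand-in for the reworked mem_cgroup_oom(): true means a victim was
 * killed and the caller should retry the charge; false means the charge
 * fails now and the end-of-fault path (mem_cgroup_oom_synchronize() in
 * the real kernel) has to deal with it.
 */
static bool memcg_oom(struct memcg *mc)
{
	if (!mc->oom_kill_disable && select_victim(mc))
		return true;
	/* the real code records current->memcg_in_oom and friends here */
	return false;
}

/* Stand-in for try_charge() with the retry added by the patch. */
static int charge(struct memcg *mc, long nr_pages)
{
	int retries = RECLAIM_RETRIES;

	while (retries--) {
		if (mc->usage + nr_pages <= mc->limit) {
			mc->usage += nr_pages;
			return 0;		/* charge succeeded */
		}
		/* reclaim attempts elided; assume they made no progress */
		if (memcg_oom(mc)) {
			retries = RECLAIM_RETRIES;	/* fresh retry budget */
			continue;
		}
		return -1;	/* -ENOMEM, handled at the end of the fault */
	}
	return -1;
}

int main(void)
{
	struct memcg mc = { .oom_kill_disable = false,
			    .usage = 200, .limit = 256, .victims = 2 };

	printf("charge 100 pages: %d (usage now %ld)\n",
	       charge(&mc, 100), mc.usage);
	return 0;
}

With these numbers the first attempt overruns the limit, one victim is
killed, and the retried charge fits, which is the case the nr_retries reset
in try_charge is meant to handle.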