On Mon, Sep 09, 2013 at 02:56:59PM +0200, Michal Hocko wrote:
> [Adding Glauber - the full patch is here https://lkml.org/lkml/2013/9/5/319]
>
> On Mon 09-09-13 14:36:25, Michal Hocko wrote:
> > On Thu 05-09-13 12:18:17, Johannes Weiner wrote:
> > [...]
> > > From: Johannes Weiner <hannes@xxxxxxxxxxx>
> > > Subject: [patch] mm: memcg: do not trap chargers with full callstack on OOM
> > >
> > [...]
> > >
> > > To fix this, never do any OOM handling directly in the charge context.
> > > When an OOM situation is detected, let the task remember the memcg and
> > > then handle the OOM (kill or wait) only after the page fault stack is
> > > unwound and about to return to userspace.
> >
> > OK, this is indeed nicer because the oom setup is trivial and the
> > handling is not split into two parts and everything happens close to
> > out_of_memory where it is expected.
>
> Hmm, wait a second. I have completely forgotten about the kmem charging
> path during the review.
>
> So while previously memcg_charge_kmem could have oom killed a
> task if it couldn't charge to the u-limit after it managed
> to charge k-limit, now it would simply fail because there is no
> mem_cgroup_{enable,disable}_oom around __mem_cgroup_try_charge it relies
> on. The allocation will fail in the end but I am not sure whether the
> missing oom is an issue or not for existing use cases.

Kernel sites should be able to handle -ENOMEM, right?  And if this
nests inside a userspace fault, it'll still enter OOM.

> My original objection about oom triggered from kmem paths was that oom
> is not kmem aware so the oom decisions might be totally bogus. But we
> still have that:

Well, k should be a fraction of u+k on any reasonable setup, so there
are always appropriate candidates to take down.

> 	/*
> 	 * Conditions under which we can wait for the oom_killer. Those are
> 	 * the same conditions tested by the core page allocator
> 	 */
> 	may_oom = (gfp & __GFP_FS) && !(gfp & __GFP_NORETRY);
>
> 	_memcg = memcg;
> 	ret = __mem_cgroup_try_charge(NULL, gfp, size >> PAGE_SHIFT,
> 				      &_memcg, may_oom);
>
> I do not mind having may_oom = false unconditionally in that path but I
> would like to hear from Glauber first.

The patch I just sent to azur puts this conditional into try_charge(),
so I'd just change the kmem site to pass `true'.
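
For illustration only, the may_oom condition quoted above, and the idea
of moving it into the charge function so that the kmem site can simply
pass `true', could be sketched in plain userspace C as follows.  This is
not the kernel code and not the patch mentioned above: the SKETCH_* flag
values and both function names are hypothetical stand-ins, and the real
__GFP_FS and __GFP_NORETRY bits have different values.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's gfp bits; real values differ. */
typedef unsigned int sketch_gfp_t;
#define SKETCH_GFP_FS      0x1u
#define SKETCH_GFP_NORETRY 0x2u

/*
 * The condition from the quoted kmem path: waiting for the OOM killer
 * is only acceptable if the allocation may enter the filesystem and
 * has not asked to fail without retrying.
 */
static bool sketch_gfp_allows_oom(sketch_gfp_t gfp)
{
	return (gfp & SKETCH_GFP_FS) && !(gfp & SKETCH_GFP_NORETRY);
}

/*
 * Assumed shape of a charge helper with the conditional moved inside:
 * the caller only says whether OOM handling is acceptable at all, and
 * the gfp check is applied here rather than at every call site.
 */
static bool sketch_charge_may_oom(sketch_gfp_t gfp, bool caller_may_oom)
{
	return caller_may_oom && sketch_gfp_allows_oom(gfp);
}
```

With a helper of this shape, a call site that passes `true' still gets
no OOM handling for SKETCH_GFP_NORETRY allocations or for allocations
without SKETCH_GFP_FS, which is the behaviour the open-coded condition
gives today.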