On Thu 05-09-13 07:54:30, Johannes Weiner wrote:
[...]
> From: Johannes Weiner <hannes@xxxxxxxxxxx>
> Subject: [patch] mm: memcg: handle non-error OOM situations more gracefully
>
> Many places that can trigger a memcg OOM situation return gracefully
> and don't propagate VM_FAULT_OOM up the fault stack.
>
> It's not practical to annotate all of them to disable the memcg OOM
> killer. Instead, just clean up any set OOM state without warning in
> case the fault is not returning VM_FAULT_OOM.
>
> Also fail charges immediately when the current task already is in an
> OOM context. Otherwise, the previous context gets overwritten and the
> memcg reference is leaked.

Could you paste find_or_create_page called from __get_blk as an example
here, please? So that we do not have to scratch our heads again later...

Also, the task_in_memcg_oom check could be stuffed into the
mem_cgroup_disable_oom branch to reduce the overhead for in-kernel
faults. The overhead shouldn't be noticeable, so I am not sure this is
that important.

> Signed-off-by: Johannes Weiner <hannes@xxxxxxxxxxx>

I do not see any easier way to fix this without going back to the old
behavior, which is much worse.

Acked-by: Michal Hocko <mhocko@xxxxxxx>

Thanks!

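For illustration, a minimal user-space sketch of the second suggestion
above: doing the task_in_memcg_oom() check only inside the
FAULT_FLAG_USER branch, next to mem_cgroup_disable_oom(). This is not
kernel code. struct task_model, fake_fault(), handle_mm_fault_model()
and cleanup_calls are made-up stand-ins, the flag values are arbitrary,
and the stubs only mimic the assumption that an OOM context can be
recorded only while memcg OOM handling is enabled, i.e. during a user
fault.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; values and layout are NOT the kernel's. */
#define FAULT_FLAG_USER 0x1u
#define VM_FAULT_OOM    0x1u

struct task_model {
	bool may_oom;      /* memcg OOM handling enabled for this fault */
	bool in_memcg_oom; /* an OOM context was recorded during the fault */
};

static struct task_model current_task;
static int cleanup_calls; /* counts silent cleanups, for demonstration only */

static void mem_cgroup_enable_oom(void)  { current_task.may_oom = true; }
static void mem_cgroup_disable_oom(void) { current_task.may_oom = false; }
static bool task_in_memcg_oom(void)      { return current_task.in_memcg_oom; }

/* Stub for mem_cgroup_oom_synchronize(false): drop the context quietly. */
static void mem_cgroup_oom_synchronize(bool handle)
{
	(void)handle;
	current_task.in_memcg_oom = false;
	cleanup_calls++;
}

/*
 * Stub fault path (assumption): a charge that hits the limit records an
 * OOM context only when OOM handling is enabled for the task.
 */
static unsigned int fake_fault(bool hits_limit, bool returns_oom)
{
	if (hits_limit && current_task.may_oom)
		current_task.in_memcg_oom = true;
	return returns_oom ? VM_FAULT_OOM : 0;
}

static unsigned int handle_mm_fault_model(unsigned int flags,
					  bool hits_limit, bool returns_oom)
{
	unsigned int ret;

	/* A leftover context from an earlier fault would be a bug. */
	assert(!task_in_memcg_oom());

	if (flags & FAULT_FLAG_USER)
		mem_cgroup_enable_oom();

	ret = fake_fault(hits_limit, returns_oom);

	if (flags & FAULT_FLAG_USER) {
		mem_cgroup_disable_oom();
		/* Only user faults can have set up a context, so check here. */
		if (task_in_memcg_oom() && !(ret & VM_FAULT_OOM))
			mem_cgroup_oom_synchronize(false);
	}

	return ret;
}
```

With this arrangement a fault without FAULT_FLAG_USER never reaches the
check, while a user fault that hit the limit but does not return
VM_FAULT_OOM still has its context cleaned up silently.
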
> diff --git a/mm/memory.c b/mm/memory.c
> index cdbe41b..cdad471 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -57,7 +57,6 @@
>  #include <linux/swapops.h>
>  #include <linux/elf.h>
>  #include <linux/gfp.h>
> -#include <linux/stacktrace.h>
>  
>  #include <asm/io.h>
>  #include <asm/pgalloc.h>
> @@ -3521,11 +3520,8 @@ int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,
>  	if (flags & FAULT_FLAG_USER)
>  		mem_cgroup_disable_oom();
>  
> -	if (WARN_ON(task_in_memcg_oom(current) && !(ret & VM_FAULT_OOM))) {
> -		printk("Fixing unhandled memcg OOM context set up from:\n");
> -		print_stack_trace(&current->memcg_oom.trace, 0);
> -		mem_cgroup_oom_synchronize();
> -	}
> +	if (task_in_memcg_oom(current) && !(ret & VM_FAULT_OOM))
> +		mem_cgroup_oom_synchronize(false);
>  
>  	return ret;
>  }
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index aa60863..3bf664c 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -785,7 +785,7 @@ out:
>   */
>  void pagefault_out_of_memory(void)
>  {
> -	if (mem_cgroup_oom_synchronize())
> +	if (mem_cgroup_oom_synchronize(true))
>  		return;
>  	if (try_set_system_oom()) {
>  		out_of_memory(NULL, 0, 0, NULL);
> -- 
> 1.8.4
>

-- 
Michal Hocko
SUSE Labs