On Tue, 3 Dec 2013, Michal Hocko wrote: > OK, as it seems that the notification part is too controversial, how > would you like the following? It reverts the notification part and still > solves the fault on exit path. I will prepare the full patch with the > changelog if this looks reasonable: Um, no, that's not satisfactory because it obviously does the check after mem_cgroup_oom_notify(). There is absolutely no reason why userspace should be woken up when current simply needs access to memory reserves to exit. You can already get such notification by memory thresholds at the memcg limit. I'll repeat: Section 10 of Documentation/cgroups/memory.txt specifies what userspace should do when waking up; one of those options is not "check if the memcg is still actually oom in a short period of time once a charging task with a pending SIGKILL or in the exit path has been able to exit." Users of this interface typically also disable the memcg oom killer through the same file, it's ludicrous to put the responsibility on userspace to determine if the wakeup is actionable and requires it to intervene in one of the methods listed in section 10. > --- > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index 28c9221b74ea..f44fe7e65a98 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -1783,6 +1783,16 @@ static void mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask, > unsigned int points = 0; > struct task_struct *chosen = NULL; > > + /* > + * If current has a pending SIGKILL or is exiting, then automatically > + * select it. The goal is to allow it to allocate so that it may > + * quickly exit and free its memory. > + */ > + if (fatal_signal_pending(current) || current->flags & PF_EXITING) { > + set_thread_flag(TIF_MEMDIE); > + goto cleanup; > + } > + > check_panic_on_oom(CONSTRAINT_MEMCG, gfp_mask, order, NULL); > totalpages = mem_cgroup_get_limit(memcg) >> PAGE_SHIFT ? : 1; > for_each_mem_cgroup_tree(iter, memcg) { > @@ -2233,16 +2243,6 @@ bool mem_cgroup_oom_synchronize(bool handle) > if (!handle) > goto cleanup; > > - /* > - * If current has a pending SIGKILL or is exiting, then automatically > - * select it. The goal is to allow it to allocate so that it may > - * quickly exit and free its memory. > - */ > - if (fatal_signal_pending(current) || current->flags & PF_EXITING) { > - set_thread_flag(TIF_MEMDIE); > - goto cleanup; > - } > - > owait.memcg = memcg; > owait.wait.flags = 0; > owait.wait.func = memcg_oom_wake_function; > @@ -2266,6 +2266,13 @@ bool mem_cgroup_oom_synchronize(bool handle) > schedule(); > mem_cgroup_unmark_under_oom(memcg); > finish_wait(&memcg_oom_waitq, &owait.wait); > + > + /* Userspace OOM handler cannot set TIF_MEMDIE to a target */ > + if (memcg->oom_kill_disable) { > + if ((fatal_signal_pending(current) || > + current->flags & PF_EXITING)) > + set_thread_flag(TIF_MEMDIE); > + } > } > > if (locked) { -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>