On Mon, 16 Aug 2010 18:15:57 -0700 (PDT)
David Rientjes <rientjes@xxxxxxxxxx> wrote:

> The oom killer's goal is to kill a memory-hogging task so that it may
> exit, free its memory, and allow the current context to allocate the
> memory that triggered it in the first place.  Thus, killing a task is
> pointless if other threads sharing its mm cannot be killed because of its
> /proc/pid/oom_adj or /proc/pid/oom_score_adj value.
>
> This patch checks all user threads on the system to determine whether
> oom_badness(p) should return 0 for p, which means it should not be killed.
> If a thread shares p's mm and is unkillable, p is considered to be
> unkillable as well.
>
> Kthreads are not considered toward this rule since they only temporarily
> assume a task's mm via use_mm().
>
> Signed-off-by: David Rientjes <rientjes@xxxxxxxxxx>

Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>

Thank you. BTW, do you have a good idea for a speed-up?
This code seems terribly slow when a system has many processes.

> ---
>  v2: change do_each_thread() to for_each_process() as suggested by Oleg.
>
>  It's actually not possible to move this logic to oom_kill_task() because
>  it's racy: oom_badness() is not a constant score and depends on the state
>  of the VM when it is called.  This leads to unnecessarily panicking the
>  machine in that case as well as when the same child to sacrifice is
>  repeatedly selected in oom_kill_process() based on the parent's badness
>  score.
>
>  mm/oom_kill.c | 28 +++++++++++++++++++++-------
>  1 files changed, 21 insertions(+), 7 deletions(-)
>
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -83,6 +83,25 @@ static bool has_intersects_mems_allowed(struct task_struct *tsk,
>  #endif /* CONFIG_NUMA */
>
>  /*
> + * Determines whether an mm is unfreeable since a user thread attached to
> + * it cannot be killed.  Kthreads only temporarily assume a thread's mm,
> + * so they are not considered.
> + *
> + * mm need not be protected by task_lock() since it will not be
> + * dereferenced.
> + */
> +static bool is_mm_unfreeable(struct mm_struct *mm)
> +{
> +	struct task_struct *p;
> +
> +	for_each_process(p)
> +		if (p->mm == mm && !(p->flags & PF_KTHREAD) &&
> +		    p->signal->oom_score_adj == OOM_SCORE_ADJ_MIN)
> +			return true;
> +	return false;
> +}
> +
> +/*
>   * If this is a system OOM (not a memcg OOM) and the task selected to be
>   * killed is not already running at high (RT) priorities, speed up the
>   * recovery by boosting the dying task to the lowest FIFO priority.
> @@ -160,12 +179,7 @@ unsigned int oom_badness(struct task_struct *p, struct mem_cgroup *mem,
>  	p = find_lock_task_mm(p);
>  	if (!p)
>  		return 0;
> -
> -	/*
> -	 * Shortcut check for OOM_SCORE_ADJ_MIN so the entire heuristic doesn't
> -	 * need to be executed for something that cannot be killed.
> -	 */
> -	if (p->signal->oom_score_adj == OOM_SCORE_ADJ_MIN) {
> +	if (is_mm_unfreeable(p->mm)) {
>  		task_unlock(p);
>  		return 0;
>  	}
> @@ -675,7 +689,7 @@ void out_of_memory(struct zonelist *zonelist, gfp_t gfp_mask,
>  	read_lock(&tasklist_lock);
>  	if (sysctl_oom_kill_allocating_task &&
>  	    !oom_unkillable_task(current, NULL, nodemask) &&
> -	    (current->signal->oom_adj != OOM_DISABLE)) {
> +	    !is_mm_unfreeable(current->mm)) {
>  		/*
>  		 * oom_kill_process() needs tasklist_lock held.  If it returns
>  		 * non-zero, current could not be killed so we must fallback to
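
For illustration only, here is a minimal userspace sketch of the check that is_mm_unfreeable() performs. It is not kernel code: struct mm, struct task, and the flat task array are hypothetical stand-ins for mm_struct, task_struct, and for_each_process(), and the two constants are defined locally to mirror the kernel's values.

```c
#include <stdbool.h>
#include <stddef.h>

/* Local stand-ins mirroring the kernel constants; assumptions for this sketch. */
#define OOM_SCORE_ADJ_MIN (-1000)
#define PF_KTHREAD        0x00200000

/* Hypothetical, simplified stand-in for struct mm_struct. */
struct mm {
	int id;
};

/* Hypothetical, simplified stand-in for struct task_struct. */
struct task {
	struct mm *mm;        /* address space the task is attached to */
	unsigned int flags;   /* PF_KTHREAD marks a kernel thread */
	int oom_score_adj;    /* OOM_SCORE_ADJ_MIN means "never kill" */
};

/*
 * Returns true if any user task attached to mm is unkillable.
 * mm is only compared by address and never dereferenced.
 * The array walk stands in for for_each_process().
 */
static bool is_mm_unfreeable(const struct task *tasks, size_t n,
			     const struct mm *mm)
{
	for (size_t i = 0; i < n; i++) {
		const struct task *p = &tasks[i];

		if (p->mm == mm && !(p->flags & PF_KTHREAD) &&
		    p->oom_score_adj == OOM_SCORE_ADJ_MIN)
			return true;
	}
	return false;
}
```

The sketch also makes the cost behind the speed-up question visible: one call walks every task, so if the check runs once per candidate, as it would when oom_badness() scores each task in turn, the total work grows quadratically with the number of processes.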