On 04/23, Michal Hocko wrote: > > [CCing Oleg] > > On Tue 23-04-13 19:26:14, dserrg wrote: > > On Mon, 22 Apr 2013 21:51:38 +0200 > > > > Yes, we are holding tasklist_lock when iterating, but the thread can be deleted > > from thread_group list _before_ that. In this case, while_each_thread loop exit > > condition will never be true. Yes, while_each_thread() should be only used if know that ->thread_group list is valid. For example, if you do find_task_by_vpid() under rcu_read_lock() while_each_thread() is fine _unless_ you drop rcu lock in between. This is the common mistake, people often forget about this. But I can't understand how this patch can fix the problem, I think it can't. >From the changelog: When SIGKILL is sent to a task, it's also sent to all tasks in the same threadgroup. This information can be used to prevent triggering further oom killers for this threadgroup and avoid the infinite loop. ^^^^^^^^^^^^^^^^^^^^^^^ How?? > Oleg, is there anything that would prevent from this race? Maybe we need > to call thread_group_empty before? You need to check, say, pid_alive(). Or PF_EXITING. For example oom_kill_process() looks wrong. It does check PF_EXITING before while_each_thread(), but this is racy because it should check it under tasklist_lock. So I think that oom_kill_process() needs something like below, but this code really needs the cleanups. Oleg. --- x/mm/oom_kill.c +++ x/mm/oom_kill.c @@ -436,6 +436,14 @@ void oom_kill_process(struct task_struct * parent. This attempts to lose the minimal amount of work done while * still freeing memory. */ + rcu_read_lock(); + if (!pid_alive(p)) { + rcu_read_unlock(); + set_tsk_thread_flag(p, TIF_MEMDIE); + put_task_struct(p); + return; + } + read_lock(&tasklist_lock); do { list_for_each_entry(child, &t->children, sibling) { @@ -458,7 +466,6 @@ void oom_kill_process(struct task_struct } while_each_thread(p, t); read_unlock(&tasklist_lock); - rcu_read_lock(); p = find_lock_task_mm(victim); if (!p) { rcu_read_unlock(); -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>