Re: [PATCH] oom: add pending SIGKILL check for chosen victim

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 04/23, Michal Hocko wrote:
>
> [CCing Oleg]
>
> On Tue 23-04-13 19:26:14, dserrg wrote:
> > On Mon, 22 Apr 2013 21:51:38 +0200
> >
> > Yes, we are holding tasklist_lock when iterating, but the thread can be deleted
> > from thread_group list _before_ that. In this case, while_each_thread loop exit
> > condition will never be true.

Yes, while_each_thread() should be only used if know that ->thread_group
list is valid.

For example, if you do find_task_by_vpid() under rcu_read_lock()
while_each_thread() is fine _unless_ you drop rcu lock in between.

This is the common mistake, people often forget about this.

But I can't understand how this patch can fix the problem, I think it
can't.

>From the changelog:

	When SIGKILL is sent to a task, it's also sent to all tasks in the same
	threadgroup. This information can be used to prevent triggering further
	oom killers for this threadgroup and avoid the infinite loop.
                                             ^^^^^^^^^^^^^^^^^^^^^^^

How??


> Oleg, is there anything that would prevent from this race? Maybe we need
> to call thread_group_empty before?

You need to check, say, pid_alive(). Or PF_EXITING.

For example oom_kill_process() looks wrong. It does check PF_EXITING
before while_each_thread(), but this is racy because it should check
it under tasklist_lock.

So I think that oom_kill_process() needs something like below, but
this code really needs the cleanups.

Oleg.


--- x/mm/oom_kill.c
+++ x/mm/oom_kill.c
@@ -436,6 +436,14 @@ void oom_kill_process(struct task_struct
 	 * parent.  This attempts to lose the minimal amount of work done while
 	 * still freeing memory.
 	 */
+	rcu_read_lock();
+	if (!pid_alive(p)) {
+		rcu_read_unlock();
+		set_tsk_thread_flag(p, TIF_MEMDIE);
+		put_task_struct(p);
+		return;
+	}
+
 	read_lock(&tasklist_lock);
 	do {
 		list_for_each_entry(child, &t->children, sibling) {
@@ -458,7 +466,6 @@ void oom_kill_process(struct task_struct
 	} while_each_thread(p, t);
 	read_unlock(&tasklist_lock);
 
-	rcu_read_lock();
 	p = find_lock_task_mm(victim);
 	if (!p) {
 		rcu_read_unlock();

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]