On 01/28, Linus Torvalds wrote:
>
> On Sat, Jan 28, 2012 at 4:03 AM, tip-bot for Yasunori Goto
> <y-goto@xxxxxxxxxxxxxx> wrote:
> >
> > sched: Fix ancient race in do_exit()
>
> Ugh.
>
> It would be much nicer to just clear the rwsem waiter->task thing
> *after* waking the task up, which would avoid this race entirely,
> afaik.

How? The problem is that wake_up_process(tsk) sees this task in
TASK_UNINTERRUPTIBLE state (the first "p->state & state" check in
try_to_wake_up), but then this task changes its state to TASK_DEAD
without schedule() and ttwu() does s/TASK_DEAD/TASK_RUNNING/.

IOW, the task doing

	current->state = TASK_A;
	...
	current->state = TASK_B;
	schedule();

can be woken up by try_to_wake_up(TASK_A), despite the fact it
sleeps in TASK_B.

do_exit() is only "special" because it is not easy to handle
the spurious wakeup.

> Tell me, why wouldn't that work? rwsem_down_failed_common() does
>
>	/* wait to be given the lock */
>	for (;;) {
>		if (!waiter.task)
>			break;
>		...
>
> so then we wouldn't need the task refcount crap in rwsem either etc,
> and we'd get rid of all races with wakeup.
>
> I wonder why we're clearing that whole waiter->task so early.

I must have missed something. I can't understand how this can help,
and "clear the rwsem waiter->task thing *after* waking" looks
obviously wrong. If we do this, then we can miss the "!!waiter.task"
condition. The loop above actually does

	set_task_state(TASK_UNINTERRUPTIBLE);
	if (!waiter.task)
		break;
	schedule();

and

	wake_up_process(tsk);
	waiter->task = NULL;

can happen right after set_task_state().

Oleg.
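
To make the TASK_A/TASK_B interleaving described above concrete, here
is a minimal user-space sketch. It is not kernel code: struct task,
ttwu_check(), ttwu_commit() and the numeric state values are
hypothetical stand-ins, and try_to_wake_up() is split into its state
check and its state write only so that the racy ordering can be
replayed deterministically in a single thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel task states; the values are arbitrary. */
enum { TASK_RUNNING = 0, TASK_UNINTERRUPTIBLE = 2, TASK_DEAD = 64 };

/* Hypothetical, heavily reduced model of a task: only its state word. */
struct task {
	int state;
};

/* Models the first "p->state & state" check in try_to_wake_up(). */
static bool ttwu_check(const struct task *p, int state)
{
	return (p->state & state) != 0;
}

/* Models the later part of ttwu(): the task is marked runnable. */
static void ttwu_commit(struct task *p)
{
	p->state = TASK_RUNNING;
}

/*
 * Racy ordering: the waker passes the check while the task is still in
 * TASK_UNINTERRUPTIBLE, the task then moves to TASK_DEAD without calling
 * schedule(), and the waker overwrites TASK_DEAD with TASK_RUNNING.
 */
static int replay_racy_order(void)
{
	struct task t = { TASK_UNINTERRUPTIBLE };
	bool matched = ttwu_check(&t, TASK_UNINTERRUPTIBLE);

	t.state = TASK_DEAD;		/* task: current->state = TASK_B */
	if (matched)
		ttwu_commit(&t);	/* waker: s/TASK_DEAD/TASK_RUNNING/ */
	return t.state;
}

/*
 * Benign ordering: the task is already in TASK_DEAD when the waker
 * checks, so the check fails and the state is left alone.
 */
static int replay_benign_order(void)
{
	struct task t = { TASK_UNINTERRUPTIBLE };

	t.state = TASK_DEAD;
	if (ttwu_check(&t, TASK_UNINTERRUPTIBLE))
		ttwu_commit(&t);
	return t.state;
}
```

In this model replay_racy_order() ends in TASK_RUNNING, i.e. the task
was "woken" out of a state the waker never asked for, while
replay_benign_order() stays in TASK_DEAD. The sketch ignores locking
and memory ordering; it only shows why the outcome depends on where
the state change falls relative to the waker's check.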