On Tue, May 29, 2012 at 09:15:00AM +0800, Asias He wrote:
> After hot-unplugging a stressed disk, I found that rl->wait[] is not empty
> while rl->count[] is empty, and there are threads still sleeping on
> get_request after the queue cleanup. With simple debug code, I found
> there are exactly nr_sleep - nr_wakeup threads in D state. So there
> are missed wakeups.
>
>   $ dmesg | grep nr_sleep
>   [ 52.917115] ---> nr_sleep=1046, nr_wakeup=873, delta=173
>   $ vmstat 1
>   1 173 0 712640 24292 96172 0 0 0 0 419 757 0 0 0 100 0
>
> To quote Tejun:
>
>   Ah, okay, freed_request() wakes up single waiter with the assumption
>   that after the wakeup there will at least be one successful allocation
>   which in turn will continue the wakeup chain until the wait list is
>   empty - ie. waiter wakeup is dependent on successful request
>   allocation happening after each wakeup. With queue marked dead, any
>   woken up waiter fails the allocation path, so the wakeup chaining is
>   lost and we're left with hung waiters. What we need is wake_up_all()
>   after drain completion.
>
> This patch fixes the missed wakeup by waking up all the threads which
> are sleeping on wait queue after queue drain.
>
> Changes in v2: Drop waitqueue_active() optimization
>
> Signed-off-by: Asias He <asias@xxxxxxxxxx>

Acked-by: Tejun Heo <tj@xxxxxxxxxx>

Jens, this one wants Cc: stable.

Thanks.

-- 
tejun
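
For readers following the wakeup-chain argument quoted above, here is a
minimal userspace sketch of it. This is not kernel code and not the patch
itself: struct toy_queue and the toy_* helpers are hypothetical names made
up for illustration, and the model assumes that each woken waiter either
allocates successfully (and so eventually triggers the next wakeup) or
fails because the queue is dead.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a request list; hypothetical, not a kernel structure. */
struct toy_queue {
	int sleepers;	/* waiters sleeping in the allocation path */
	bool dead;	/* queue has been marked dead by cleanup */
};

/*
 * Model one single-waiter wakeup.  The woken waiter leaves the wait list
 * and retries its allocation.  On a live queue the allocation succeeds,
 * and freeing that request later wakes the next waiter, so the chain
 * continues.  On a dead queue the allocation fails and nothing is freed,
 * so nobody wakes the next waiter.
 *
 * Returns true if the chain continues.
 */
static bool toy_wake_one(struct toy_queue *q)
{
	assert(q->sleepers >= 0);
	if (q->sleepers == 0)
		return false;
	q->sleepers--;
	return !q->dead;
}

/* Chained single wakeups; returns how many waiters are left sleeping. */
static int toy_chain_wakeups(struct toy_queue *q)
{
	while (toy_wake_one(q))
		;
	return q->sleepers;
}

/* Wake every sleeper unconditionally, modelling wake_up_all() after drain. */
static int toy_wake_all(struct toy_queue *q)
{
	q->sleepers = 0;
	return q->sleepers;
}
```

In this model, toy_chain_wakeups() on a live queue with five sleepers
leaves none behind, while on a dead queue it leaves four: the first woken
waiter fails and the chain stops. Calling toy_wake_all() afterwards clears
the rest, which is the effect the patch aims for. The model says nothing
about how many waiters hang in a real run; that depends on timing.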