On Thu, 9 May 2019 13:02:11 -0500 minyard@xxxxxxx wrote: > From: Corey Minyard <cminyard@xxxxxxxxxx> > > The function call do_wait_for_common() has a race condition that > can result in lockups waiting for completions. Adding the thread > to (and removing the thread from) the wait queue for the completion > is done outside the do loop in that function. However, if the thread > is woken up, the swake_up_locked() function will delete the entry > from the wait queue. If that happens and another thread sneaks > in and decrements the done count in the completion to zero, the > loop will go around again, but the thread will no longer be in the > wait queue, so there is no way to wake it up. > > Visually, here's a diagram from Sebastian Andrzej Siewior: > T0 T1 T2 > wait_for_completion() > do_wait_for_common() > __prepare_to_swait() > schedule() > complete() > x->done++ (0 -> 1) > raw_spin_lock_irqsave() > swake_up_locked() wait_for_completion() > wake_up_process(T0) > list_del_init() > raw_spin_unlock_irqrestore() > raw_spin_lock_irq(&x->wait.lock) > raw_spin_lock_irq(&x->wait.lock) x->done != UINT_MAX, 1 -> 0 > raw_spin_unlock_irq(&x->wait.lock) > return 1 > while (!x->done && timeout), > continue loop, not enqueued > on &x->wait > > Basically, the problem is that the original wait queues used in > completions did not remove the item from the queue in the wakeup > function, but swake_up_locked() does. > > Fix it by adding/removing the thread to/from the wait queue inside > the do loop. > > Fixes: a04ff6b4ec4ee7e ("completion: Use simple wait queues") > Signed-off-by: Corey Minyard <cminyard@xxxxxxxxxx> > --- > This looks like a fairly serious bug, I guess, but I've never seen a > report on it before. > > I found it because I have an out-of-tree feature (hopefully in tree some > day) that takes a core dump of a running process without killing it. It > makes extensive use of completions, and the test code is fairly brutal. > It didn't lock up on stock 4.19, but failed with the RT patches applied. > > The funny thing is, I've never seen this test code fail before on earlier > releases, but it locks up pretty reliably on 4.19-rt. It looks like this > bug goes back to at least the 4.4-rt kernel. But we haven't received any > customer reports of failures. I'm guessing that almost all completion > users have a single waiter, so you would never see this. > > The feature and test are in a public tree if someone wants to try to > reproduce this. But hopefully this is pretty obvious with the explaination. > > Also, you could put the DECLARE_SWAITQUEUE() outside the loop, I think, > but maybe it's cleaner or safer to declare it in the loop? If someone > cares I can test it that way. > > -corey > > kernel/sched/completion.c | 8 ++++---- > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/kernel/sched/completion.c b/kernel/sched/completion.c > index 755a58084978..4f9b4cc0c95a 100644 > --- a/kernel/sched/completion.c > +++ b/kernel/sched/completion.c > @@ -70,20 +70,20 @@ do_wait_for_common(struct completion *x, > long (*action)(long), long timeout, int state) > { > if (!x->done) { > - DECLARE_SWAITQUEUE(wait); > - > - __prepare_to_swait(&x->wait, &wait); > do { > + DECLARE_SWAITQUEUE(wait); > + > if (signal_pending_state(state, current)) { > timeout = -ERESTARTSYS; > break; > } > + __prepare_to_swait(&x->wait, &wait); Sebastian mentioned that moving the __prepare_to_swait() was the only change needed. Please do that. Thanks! -- Steve > __set_current_state(state); > raw_spin_unlock_irq(&x->wait.lock); > timeout = action(timeout); > raw_spin_lock_irq(&x->wait.lock); > + __finish_swait(&x->wait, &wait); > } while (!x->done && timeout); > - __finish_swait(&x->wait, &wait); > if (!x->done) > return timeout; > }