On Fri, 2024-07-12 at 09:55 -0400, Benjamin Coddington wrote:
> We've observed NFS clients with sync tasks sleeping in __rpc_execute
> waiting on RPC_TASK_QUEUED that have not responded to a wake-up from
> rpc_make_runnable(). I suspect this problem usually goes unnoticed,
> because on a busy client the task will eventually be re-awoken by
> another task completion or xprt event. However, if the state manager
> is draining the slot table, a sync task missing a wake-up can result
> in a hung client.
>
> We've been able to prove that the waker in rpc_make_runnable()
> successfully calls wake_up_bit() (ie- there's no race to tk_runstate),
> but the wake_up_bit() call fails to wake the waiter. I suspect the
> waker is missing the load of the bit's wait_queue_head, so
> waitqueue_active() is false. There are some very helpful comments
> about this problem above wake_up_bit(), prepare_to_wait(), and
> waitqueue_active().
>
> Fix this by inserting smp_mb() before the wake_up_bit(), which pairs
> with prepare_to_wait() calling set_current_state().
>
> Signed-off-by: Benjamin Coddington <bcodding@xxxxxxxxxx>
> ---
>  net/sunrpc/sched.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c
> index 6debf4fd42d4..34b31be75497 100644
> --- a/net/sunrpc/sched.c
> +++ b/net/sunrpc/sched.c
> @@ -369,8 +369,11 @@ static void rpc_make_runnable(struct workqueue_struct *wq,
>  	if (RPC_IS_ASYNC(task)) {
>  		INIT_WORK(&task->u.tk_work, rpc_async_schedule);
>  		queue_work(wq, &task->u.tk_work);
> -	} else
> +	} else {
> +		/* paired with set_current_state() in prepare_to_wait */
> +		smp_mb();

Hmm... Why isn't it sufficient to use smp_mb__after_atomic() here?
That's what clear_and_wake_up_bit() uses in this case.

>  		wake_up_bit(&task->tk_runstate, RPC_TASK_QUEUED);
> +	}
>  }
>
>  /*

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx