Re: [PATCH] SUNRPC: Fix a race to wake a sync task

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 16 Jul 2024, at 9:23, Trond Myklebust wrote:

> On Fri, 2024-07-12 at 09:55 -0400, Benjamin Coddington wrote:
>> We've observed NFS clients with sync tasks sleeping in __rpc_execute
>> waiting on RPC_TASK_QUEUED that have not responded to a wake-up from
>> rpc_make_runnable().  I suspect this problem usually goes unnoticed,
>> because on a busy client the task will eventually be re-awoken by
>> another
>> task completion or xprt event.  However, if the state manager is
>> draining
>> the slot table, a sync task missing a wake-up can result in a hung
>> client.
>>
>> We've been able to prove that the waker in rpc_make_runnable()
>> successfully
>> calls wake_up_bit() (ie- there's no race to tk_runstate), but the
>> wake_up_bit() call fails to wake the waiter.  I suspect the waker is
>> missing the load of the bit's wait_queue_head, so waitqueue_active()
>> is
>> false.  There are some very helpful comments about this problem above
>> wake_up_bit(), prepare_to_wait(), and waitqueue_active().
>>
>> Fix this by inserting smp_mb() before the wake_up_bit(), which pairs
>> with
>> prepare_to_wait() calling set_current_state().
>>
>> Signed-off-by: Benjamin Coddington <bcodding@xxxxxxxxxx>
>> ---
>>  net/sunrpc/sched.c | 5 ++++-
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c
>> index 6debf4fd42d4..34b31be75497 100644
>> --- a/net/sunrpc/sched.c
>> +++ b/net/sunrpc/sched.c
>> @@ -369,8 +369,11 @@ static void rpc_make_runnable(struct
>> workqueue_struct *wq,
>>  	if (RPC_IS_ASYNC(task)) {
>>  		INIT_WORK(&task->u.tk_work, rpc_async_schedule);
>>  		queue_work(wq, &task->u.tk_work);
>> -	} else
>> +	} else {
>> +		/* paired with set_current_state() in
>> prepare_to_wait */
>> +		smp_mb();
>
> Hmm... Why isn't it sufficient to use smp_mb__after_atomic() here?
> That's what clear_and_wake_up_bit() uses in this case.

Great question, one that I can't answer with confidence (which is why I
solicited advice under the RFC posting).

I ended up following the guidance in the comment above wake_up_bit(), since
we clear RPC_TASK_RUNNING non-atomically within the queue's spinlock.  It
states that we'd probably need smp_mb().

However - this race isn't to tk_runstate, its actually to waitqueue_active()
being true/false.  So - I believe unless we're checking the bit in the
waiter's wait_bit_action function, we really only want to look at the race
in the terms of making sure the waiter's setting of the wait_queue_head is
visible after the waker tests_and_sets RPC_TASK_RUNNING.

Ben





[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux