On Mon, Jul 31, 2023 at 04:48:33PM +1000, NeilBrown wrote:
> svc_get_next_xprt() does a lot more than just get an xprt. It also
> decides if it needs to sleep, depending not only on the availability of
> xprts, but also on the need to exit or handle external work
> (SP_TASK_PENDING).
>
> So rename it to svc_rqst_wait_and_dequeue_work(), don't return the xprt
> (which can easily be found in rqstp->rq_xprt), and restructure to make a
> clear separation between waiting and dequeueing.

For me, the most valuable part of this patch is the last part here:
refactoring the dequeue and the wait, and deduplicating the dequeue.

> All the scheduling-related code like try_to_freeze() and
> kthread_should_stop() is moved into svc_rqst_wait_and_dequeue_work().
>
> Rather than calling svc_xprt_dequeue() twice (before and after deciding
> to wait), it now calls rqst_should_sleep() twice. If the first fails,
> we skip all the waiting code completely. In the waiting code we call
> again after setting the task state in case we missed a wake-up.
>
> We now only have one call to try_to_freeze() and one call to
> svc_xprt_dequeue(). We still have two calls to kthread_should_stop() -
> one in rqst_should_sleep() to avoid sleeping, and one afterwards to
> avoid dequeueing any work (it previously came after dequeueing which
> doesn't seem right).
>
> Signed-off-by: NeilBrown <neilb@xxxxxxx>
> ---
>  net/sunrpc/svc_xprt.c | 62 +++++++++++++++++++++----------------
>  1 file changed, 31 insertions(+), 31 deletions(-)
>
> diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
> index 380fb3caea4c..67f2b34cb8e4 100644
> --- a/net/sunrpc/svc_xprt.c
> +++ b/net/sunrpc/svc_xprt.c
> @@ -722,47 +722,51 @@ rqst_should_sleep(struct svc_rqst *rqstp)
>          return true;
>  }
>
> -static struct svc_xprt *svc_get_next_xprt(struct svc_rqst *rqstp)
> +static void svc_rqst_wait_and_dequeue_work(struct svc_rqst *rqstp)

It would be simpler to follow if you renamed this function once (here),
and changed directly from returning struct svc_xprt to returning bool.

>  {
>          struct svc_pool *pool = rqstp->rq_pool;
> +        bool slept = false;
>
>          /* rq_xprt should be clear on entry */
>          WARN_ON_ONCE(rqstp->rq_xprt);
>
> -        rqstp->rq_xprt = svc_xprt_dequeue(pool);
> -        if (rqstp->rq_xprt) {
> -                trace_svc_pool_polled(rqstp);
> -                goto out_found;
> +        if (rqst_should_sleep(rqstp)) {
> +                set_current_state(TASK_IDLE);
> +                smp_mb__before_atomic();
> +                clear_bit(SP_CONGESTED, &pool->sp_flags);
> +                clear_bit(RQ_BUSY, &rqstp->rq_flags);
> +                smp_mb__after_atomic();
> +
> +                /* Need to test again after setting task state */

This comment isn't illuminating. It needs to explain the "need to test
again".

> +                if (likely(rqst_should_sleep(rqstp))) {

Is likely() still needed here?

> +                        schedule();
> +                        slept = true;
> +                } else {
> +                        __set_current_state(TASK_RUNNING);
> +                        cond_resched();

This makes me happy. Only call cond_resched() if we didn't sleep.
> +                }
> +                set_bit(RQ_BUSY, &rqstp->rq_flags);
> +                smp_mb__after_atomic();
>          }
> -
> -        set_current_state(TASK_IDLE);
> -        smp_mb__before_atomic();
> -        clear_bit(SP_CONGESTED, &pool->sp_flags);
> -        clear_bit(RQ_BUSY, &rqstp->rq_flags);
> -        smp_mb__after_atomic();
> -
> -        if (likely(rqst_should_sleep(rqstp)))
> -                schedule();
> -        else
> -                __set_current_state(TASK_RUNNING);
> -
>          try_to_freeze();
>
> -        set_bit(RQ_BUSY, &rqstp->rq_flags);
> -        smp_mb__after_atomic();
> +        if (kthread_should_stop())
> +                return;
> +
>          clear_bit(SP_TASK_PENDING, &pool->sp_flags);
>          rqstp->rq_xprt = svc_xprt_dequeue(pool);
>          if (rqstp->rq_xprt) {
> -                trace_svc_pool_awoken(rqstp);
> +                if (slept)
> +                        trace_svc_pool_awoken(rqstp);
> +                else
> +                        trace_svc_pool_polled(rqstp);

Again, it would perhaps be better if we rearranged this code first, and
then added tracepoints. This is ... well, ugly.

>                  goto out_found;
>          }
>
> -        if (kthread_should_stop())
> -                return NULL;
> -        percpu_counter_inc(&pool->sp_threads_no_work);
> -        return NULL;
> +        if (slept)
> +                percpu_counter_inc(&pool->sp_threads_no_work);
> +        return;
>  out_found:
> -        clear_bit(SP_TASK_PENDING, &pool->sp_flags);
>          /* Normally we will wait up to 5 seconds for any required
>           * cache information to be provided.
>           */
> @@ -770,7 +774,6 @@ static struct svc_xprt *svc_get_next_xprt(struct svc_rqst *rqstp)
>                  rqstp->rq_chandle.thread_wait = 5*HZ;
>          else
>                  rqstp->rq_chandle.thread_wait = 1*HZ;
> -        return rqstp->rq_xprt;
>  }
>
>  static void svc_add_new_temp_xprt(struct svc_serv *serv, struct svc_xprt *newxpt)
> @@ -854,12 +857,9 @@ void svc_recv(struct svc_rqst *rqstp)
>          if (!svc_alloc_arg(rqstp))
>                  goto out;
>
> -        try_to_freeze();
> -        cond_resched();
> -        if (kthread_should_stop())
> -                goto out;
> +        svc_rqst_wait_and_dequeue_work(rqstp);
>
> -        xprt = svc_get_next_xprt(rqstp);
> +        xprt = rqstp->rq_xprt;
>          if (!xprt)
>                  goto out;
>
> --
> 2.40.1


--
Chuck Lever