On 2020-08-17 03:08, Ming Lei wrote:
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 7c6dd6f75190..a62c29058d26 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -551,8 +551,27 @@ static void scsi_run_queue_async(struct scsi_device *sdev)
>  	if (scsi_target(sdev)->single_lun ||
>  	    !list_empty(&sdev->host->starved_list))
>  		kblockd_schedule_work(&sdev->requeue_work);
> -	else
> -		blk_mq_run_hw_queues(sdev->request_queue, true);
> +	else {

Has this patch been verified with checkpatch? Checkpatch should have warned
about the unbalanced braces.

> +		/*
> +		 * smp_mb() implied in either rq->end_io or blk_mq_free_request
> +		 * is for ordering writing .device_busy in scsi_device_unbusy()
> +		 * and reading sdev->restarts.
> +		 */

Hmm ... I don't see what orders the atomic_dec(&sdev->device_busy) from
scsi_device_unbusy() and the atomic_read() below? I don't think that the
block layer guarantees ordering of these two memory accesses since both
accesses happen in the request completion path.

> +		int old = atomic_read(&sdev->restarts);
> +
> +		if (old) {
> +			/*
> +			 * ->restarts has to be kept as non-zero if there is
> +			 * new budget contention comes.

There are two verbs in the above sentence ("is" and "comes"). Please remove
"comes" such that the sentence becomes grammatically correct.

> +			 *
> +			 * No need to run queue when either another re-run
> +			 * queue wins in updating ->restarts or one new budget
> +			 * contention comes.
> +			 */
> +			if (atomic_cmpxchg(&sdev->restarts, old, 0) == old)
> +				blk_mq_run_hw_queues(sdev->request_queue, true);
> +		}
> +	}

Please combine the two if-statements into a single if-statement using "&&"
to keep the indentation level low.
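To illustrate the combined if-statement, here is a minimal userspace sketch
using C11 atomics. It is only a model, not kernel code: struct model_sdev,
model_rerun_queue() and the queue_runs counter are hypothetical names, and
atomic_compare_exchange_strong() stands in for the kernel's atomic_cmpxchg()
(the C11 call returns a bool instead of the old value). In the patch itself
the condition would presumably read
"old && atomic_cmpxchg(&sdev->restarts, old, 0) == old".

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant part of struct scsi_device. */
struct model_sdev {
	atomic_int restarts;
	int queue_runs;	/* counts modeled blk_mq_run_hw_queues() calls */
};

/*
 * Models the else-branch of scsi_run_queue_async() with the two nested
 * if-statements merged into one. Returns true if the queue was rerun.
 */
static bool model_rerun_queue(struct model_sdev *sdev)
{
	int old = atomic_load(&sdev->restarts);

	/*
	 * Rerun the queue only if ->restarts was non-zero and this caller
	 * is the one that resets it from 'old' to zero. If the exchange
	 * fails, either another rerun won the race or new budget contention
	 * changed ->restarts, and no rerun is needed here.
	 */
	if (old && atomic_compare_exchange_strong(&sdev->restarts, &old, 0)) {
		sdev->queue_runs++;
		return true;
	}
	return false;
}
```

The short-circuit evaluation of "&&" keeps the behavior of the nested form:
the exchange is only attempted when old is non-zero.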
> @@ -1611,8 +1630,34 @@ static void scsi_mq_put_budget(struct request_queue *q)
>  static bool scsi_mq_get_budget(struct request_queue *q)
>  {
>  	struct scsi_device *sdev = q->queuedata;
> +	int ret = scsi_dev_queue_ready(q, sdev);
> +
> +	if (ret)
> +		return true;
> +
> +	atomic_inc(&sdev->restarts);
>
> -	return scsi_dev_queue_ready(q, sdev);
> +	/*
> +	 * Order writing .restarts and reading .device_busy, and make sure
> +	 * .restarts is visible to scsi_end_request(). Its pair is implied by
> +	 * __blk_mq_end_request() in scsi_end_request() for ordering
> +	 * writing .device_busy in scsi_device_unbusy() and reading .restarts.
> +	 *
> +	 */
> +	smp_mb__after_atomic();

Barriers do not guarantee "is visible to". Barriers enforce ordering of
memory accesses performed by a certain CPU core. Did you perhaps mean that
sdev->restarts must be incremented before the code below reads
sdev->device_busy?

> +	/*
> +	 * If all in-flight requests originated from this LUN are completed
> +	 * before setting .restarts, sdev->device_busy will be observed as
> +	 * zero, then blk_mq_delay_run_hw_queues() will dispatch this request
> +	 * soon. Otherwise, completion of one of these request will observe
> +	 * the .restarts flag, and the request queue will be run for handling
> +	 * this request, see scsi_end_request().
> +	 */
> +	if (unlikely(atomic_read(&sdev->device_busy) == 0 &&
> +				!scsi_device_blocked(sdev)))
> +		blk_mq_delay_run_hw_queues(sdev->request_queue, SCSI_QUEUE_DELAY);
> +	return false;
>  }

Thanks,

Bart.
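For reference, the ordering that the two comments in the patch rely on can be
modeled in userspace with C11 atomics. This is only a sketch under stated
assumptions: struct sb_model, sb_submit_side() and sb_complete_side() are
hypothetical names, the seq_cst fences stand in for full kernel barriers, and
the scsi_device_blocked() check is left out. The sketch assumes that a full
barrier exists on the completion side, which is exactly what is in question
above.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of the two counters discussed in the review. */
struct sb_model {
	atomic_int device_busy;
	atomic_int restarts;
};

/*
 * Submission side (models scsi_mq_get_budget() after budget allocation
 * failed). Returns true if no request is in flight, i.e. if this side
 * has to kick the queue itself.
 */
static bool sb_submit_side(struct sb_model *m)
{
	atomic_fetch_add_explicit(&m->restarts, 1, memory_order_relaxed);
	/* Models smp_mb__after_atomic(): order the increment before the load. */
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&m->device_busy, memory_order_relaxed) == 0;
}

/*
 * Completion side (models scsi_device_unbusy() followed by the read of
 * ->restarts). Returns true if ->restarts was observed as non-zero, i.e.
 * if this side has to rerun the queue.
 */
static bool sb_complete_side(struct sb_model *m)
{
	atomic_fetch_sub_explicit(&m->device_busy, 1, memory_order_relaxed);
	/* Assumed full barrier: order the decrement before the load. */
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&m->restarts, memory_order_relaxed) != 0;
}
```

With a full barrier between the write and the read on both sides, and with
one request in flight, the two functions cannot both return false when they
run concurrently, so at least one side runs the queue. Without the barrier on
the completion side, both loads may observe the old values and the request
could be left waiting.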