On Mon, 2018-01-22 at 11:35 +0800, Ming Lei wrote:
> @@ -1280,10 +1282,18 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list,
>  		 * - Some but not all block drivers stop a queue before
>  		 *   returning BLK_STS_RESOURCE. Two exceptions are scsi-mq
>  		 *   and dm-rq.
> +		 *
> +		 * If drivers return BLK_STS_RESOURCE and S_SCHED_RESTART
> +		 * bit is set, run queue after 10ms for avoiding IO hang
> +		 * because the queue may be idle and the RESTART mechanism
> +		 * can't work any more.
>  		 */
> -		if (!blk_mq_sched_needs_restart(hctx) ||
> +		needs_restart = blk_mq_sched_needs_restart(hctx);
> +		if (!needs_restart ||
>  		    (no_tag && list_empty_careful(&hctx->dispatch_wait.entry)))
>  			blk_mq_run_hw_queue(hctx, true);
> +		else if (needs_restart && (ret == BLK_STS_RESOURCE))
> +			blk_mq_delay_run_hw_queue(hctx, 10);
>  	}

In my opinion there are two problems with the above changes:
* Only the block driver author can know what a good choice is for the
  time after which to rerun the queue. So I think moving the rerun delay
  constant (10 ms) from block drivers into the core is a step backwards
  instead of a step forwards.
* The purpose of the BLK_MQ_S_SCHED_RESTART flag is to detect whether or
  not any of the queue runs triggered by freeing a tag happened
  concurrently. I don't think that there is any relationship between
  queue runs happening concurrently or not and the chance that driver
  resources become available. So deciding whether or not a queue should
  be rerun based on the value of the BLK_MQ_S_SCHED_RESTART flag seems
  wrong to me.

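To make the second point concrete, here is a minimal user-space sketch of
the decision that the patched tail of blk_mq_dispatch_rq_list() makes. It
is not kernel code: the enum names, patched_dispatch_tail() and its boolean
parameters are hypothetical stand-ins for blk_mq_sched_needs_restart(hctx),
no_tag, list_empty_careful(&hctx->dispatch_wait.entry) and ret. It only
mirrors the if/else chain from the quoted hunk.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for this sketch; not the kernel definitions. */
enum sketch_status {
	SKETCH_STS_OK,
	SKETCH_STS_RESOURCE,
	SKETCH_STS_DEV_RESOURCE,
};

enum sketch_action {
	SKETCH_NO_RERUN,	/* neither branch of the hunk is taken */
	SKETCH_RUN_NOW,		/* models blk_mq_run_hw_queue(hctx, true) */
	SKETCH_RUN_DELAYED,	/* models blk_mq_delay_run_hw_queue(hctx, 10) */
};

/*
 * Mirrors the if/else chain of the quoted hunk.
 * needs_restart    models blk_mq_sched_needs_restart(hctx)
 * wait_entry_empty models list_empty_careful(&hctx->dispatch_wait.entry)
 */
static enum sketch_action patched_dispatch_tail(bool needs_restart,
						bool no_tag,
						bool wait_entry_empty,
						enum sketch_status ret)
{
	if (!needs_restart || (no_tag && wait_entry_empty))
		return SKETCH_RUN_NOW;
	else if (needs_restart && ret == SKETCH_STS_RESOURCE)
		return SKETCH_RUN_DELAYED;

	return SKETCH_NO_RERUN;
}
```

In this model the delayed rerun is reached only when the restart flag is
set and the driver returned the plain resource status. With the flag set
and the device-resource status, this path schedules no rerun at all, so
the outcome depends on the flag rather than on anything the driver knows
about its own resources.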
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index d9ca1dfab154..55be2550c555 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -2030,9 +2030,9 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
>  	case BLK_STS_OK:
>  		break;
>  	case BLK_STS_RESOURCE:
> -		if (atomic_read(&sdev->device_busy) == 0 &&
> -		    !scsi_device_blocked(sdev))
> -			blk_mq_delay_run_hw_queue(hctx, SCSI_QUEUE_DELAY);
> +		if (atomic_read(&sdev->device_busy) ||
> +		    scsi_device_blocked(sdev))
> +			ret = BLK_STS_DEV_RESOURCE;
>  		break;
>  	default:
>  		/*

The above introduces two changes that have not been mentioned in the
description of this patch:
- The queue rerunning delay is changed from 3 ms to 10 ms. Where is the
  explanation of this change? Does this change have a positive or
  negative performance impact?
- The above modifies a guaranteed queue rerun into a queue rerun that
  may or may not happen, depending on whether or not multiple tags get
  freed concurrently (return BLK_STS_DEV_RESOURCE). Sorry but I think
  that's wrong.

Bart.