On 5/18/23 04:34, Bart Van Assche wrote:
> On 5/17/23 18:16, Ming Lei wrote:
>> On Wed, May 17, 2023 at 04:09:27PM -0700, Bart Van Assche wrote:
>>> @@ -1767,7 +1767,7 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
>>>  		break;
>>>  	case BLK_STS_RESOURCE:
>>>  	case BLK_STS_ZONE_RESOURCE:
>>> -		if (scsi_device_blocked(sdev))
>>> +		if (scsi_device_blocked(sdev) || shost->host_self_blocked)
>>>  			ret = BLK_STS_DEV_RESOURCE;
>>
>> What if scsi_unblock_requests() is just called after the above check and
>> before returning to block layer core? Then this request is invisible to
>> scsi_run_host_queues()<-scsi_unblock_requests(), and io hang happens.
>
> If returning BLK_STS_DEV_RESOURCE could cause an I/O hang, wouldn't that
> be a bug in the block layer core? Isn't the block layer core expected to
> rerun the queue after a delay if a block driver returns
> BLK_STS_DEV_RESOURCE? See also blk_mq_dispatch_rq_list().
>
DEV_RESOURCE is a tricky thing; it actually implies that the _device_ is
blocked, and no further I/O is possible.
And it's actually the responsibility of the device/driver to kick the
queue again once the contention/block/whatever is done.
So no, the block layer should not do anything here; it relies on the
driver to kick the queue.
Which is why returning DEV_RESOURCE is not recommended, seeing that it's
easy to get it wrong...
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect