On 01/19/18 07:24, Jens Axboe wrote:
> That's what I thought. So for a low queue depth underlying queue, it's
> quite possible that this situation can happen. Two potential solutions
> I see:
>
> 1) As described earlier in this thread, having a mechanism for being
>    notified when the scarce resource becomes available. It would not
>    be hard to tap into the existing sbitmap wait queue for that.
>
> 2) Have dm set BLK_MQ_F_BLOCKING and just sleep on the resource
>    allocation. I haven't read the dm code to know if this is a
>    possibility or not.
>
> I'd probably prefer #1. It's a classic case of trying to get the
> request, and if it fails, add ourselves to the sbitmap tag wait
> queue head, retry, and bail if that also fails. Connecting the
> scarce resource and the consumer is the only way to really fix
> this, without bogus arbitrary delays.
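
The "try, add ourselves to the wait queue, retry, bail" sequence
described above can be sketched in userspace C roughly as follows. This
is only an illustration under stated assumptions: struct tag_pool,
struct waiter and the pool_*() helpers are hypothetical stand-ins and
not the kernel's sbitmap API, and a "wakeup" is modelled as a counter
rather than a real scheduler wake-up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical consumer that wants to be told when a tag is freed. */
struct waiter {
	struct waiter *next;
	bool queued;
	int wakeups;	/* how many times this waiter was notified */
};

/* Hypothetical scarce resource: a tag counter plus a list of waiters. */
struct tag_pool {
	int free_tags;
	struct waiter *waiters;
};

enum get_result { GOT_TAG, WAITING };

bool pool_try_get(struct tag_pool *p)
{
	if (p->free_tags == 0)
		return false;
	p->free_tags--;
	return true;
}

void pool_add_waiter(struct tag_pool *p, struct waiter *w)
{
	if (w->queued)
		return;
	w->next = p->waiters;
	p->waiters = w;
	w->queued = true;
}

void pool_del_waiter(struct tag_pool *p, struct waiter *w)
{
	struct waiter **pp;

	for (pp = &p->waiters; *pp; pp = &(*pp)->next) {
		if (*pp == w) {
			*pp = w->next;
			w->next = NULL;
			w->queued = false;
			return;
		}
	}
}

/* Free one tag and notify (dequeue) one waiter, if any. */
void pool_put(struct tag_pool *p)
{
	struct waiter *w = p->waiters;

	p->free_tags++;
	if (w) {
		p->waiters = w->next;
		w->next = NULL;
		w->queued = false;
		w->wakeups++;
	}
}

/* Try, queue ourselves, retry, and bail out if that also fails. */
enum get_result get_or_wait(struct tag_pool *p, struct waiter *w)
{
	assert(p && w);
	if (pool_try_get(p))
		return GOT_TAG;
	pool_add_waiter(p, w);
	/* The retry covers a tag being freed before we were queued. */
	if (pool_try_get(p)) {
		pool_del_waiter(p, w);
		return GOT_TAG;
	}
	return WAITING;	/* caller bails; pool_put() will notify it */
}
```

A caller that gets WAITING back would stop dispatching and resume only
once its waiter has been notified by pool_put(), so no arbitrary delay
is needed in this model.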
(replying to an e-mail from ten days ago)
Implementing a notification mechanism for all cases in which
blk_insert_cloned_request() returns BLK_STS_RESOURCE today would require
a lot of work. If e.g. a SCSI LLD returns one of the SCSI_MLQUEUE_*_BUSY
return codes from its .queuecommand() implementation then the SCSI core
will translate that return code into BLK_STS_RESOURCE. From scsi_queue_rq():

	reason = scsi_dispatch_cmd(cmd);
	if (reason) {
		scsi_set_blocked(cmd, reason);
		ret = BLK_STS_RESOURCE;
		goto out_dec_host_busy;
	}

In other words, implementing a notification mechanism for all cases in
which blk_insert_cloned_request() can return BLK_STS_RESOURCE would
require modifying all SCSI LLDs.
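
To illustrate why the caller cannot tell which resource to wait for,
here is a minimal userspace sketch of the translation shown above. The
enum names and dispatch_status() are hypothetical stand-ins, not the
kernel's definitions; the point is only that every non-zero busy reason
collapses into the same status.

```c
#include <assert.h>

/* Hypothetical busy reasons an LLD might report; names are made up. */
enum lld_reason { LLD_OK = 0, LLD_HOST_BUSY, LLD_DEVICE_BUSY, LLD_TARGET_BUSY };

/* Hypothetical block-layer status codes; names are made up. */
enum blk_status { STS_OK = 0, STS_RESOURCE };

/* Same shape as the quoted branch: any non-zero reason becomes one status. */
enum blk_status dispatch_status(enum lld_reason reason)
{
	return reason ? STS_RESOURCE : STS_OK;
}

/* Different busy reasons are indistinguishable once translated. */
void check_collapse(void)
{
	assert(dispatch_status(LLD_HOST_BUSY) == STS_RESOURCE);
	assert(dispatch_status(LLD_DEVICE_BUSY) == STS_RESOURCE);
	assert(dispatch_status(LLD_TARGET_BUSY) == STS_RESOURCE);
	assert(dispatch_status(LLD_OK) == STS_OK);
}
```

In this model a notification would have to originate where the reason
is known, that is, in the driver, because the translated status no
longer carries it.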
Bart.