On 9/11/21 09:47, Adrian Hunter wrote:
> On 8/09/21 1:36 am, Bart Van Assche wrote:
>> --- a/drivers/scsi/ufs/ufshcd.c
>> +++ b/drivers/scsi/ufs/ufshcd.c
>> @@ -2707,6 +2707,14 @@ static int ufshcd_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd)
>>  		}
>>  		fallthrough;
>>  	case UFSHCD_STATE_RESET:
>> +		/*
>> +		 * The SCSI error handler only starts after all pending commands
>> +		 * have failed or timed out. Complete commands with
>> +		 * DID_IMM_RETRY to allow the error handler to start
>> +		 * if it has been scheduled.
>> +		 */
>> +		set_host_byte(cmd, DID_IMM_RETRY);
>> +		cmd->scsi_done(cmd);
>
> Setting non-zero return value, in this case "err = SCSI_MLQUEUE_HOST_BUSY"
> will anyway cause scsi_dec_host_busy(), so does this make any difference?
The return value should be changed to 0, since returning
SCSI_MLQUEUE_HOST_BUSY is only allowed if cmd->scsi_done(cmd) has not
yet been called.
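To make that rule concrete, here is a small user-space model, not kernel
code. Every toy_* name and every constant value below is made up for
illustration; the only thing taken from this thread is the rule itself: a
queuecommand implementation may report "host busy" only for a command it
has not completed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; the numeric values are NOT the kernel's. */
enum { TOY_QUEUED = 0, TOY_HOST_BUSY = 1 };
enum { TOY_DID_OK = 0, TOY_DID_IMM_RETRY = 1 };

struct toy_cmd {
	int host_byte;		/* models the host byte of the result */
	bool done_called;	/* set once the completion callback ran */
};

static void toy_scsi_done(struct toy_cmd *cmd)
{
	assert(!cmd->done_called);	/* a command is completed at most once */
	cmd->done_called = true;
}

/* Models completing the command while still reporting "busy". */
static int toy_queue_done_and_busy(struct toy_cmd *cmd)
{
	cmd->host_byte = TOY_DID_IMM_RETRY;
	toy_scsi_done(cmd);
	return TOY_HOST_BUSY;
}

/* Models completing the command and returning 0. */
static int toy_queue_done_and_zero(struct toy_cmd *cmd)
{
	cmd->host_byte = TOY_DID_IMM_RETRY;
	toy_scsi_done(cmd);
	return TOY_QUEUED;
}

/* The rule: "busy" is only a valid return if the command was not completed. */
static bool toy_contract_ok(const struct toy_cmd *cmd, int ret)
{
	return !(ret == TOY_HOST_BUSY && cmd->done_called);
}
```

Running both variants on a zero-initialized struct toy_cmd and passing the
result to toy_contract_ok() flags the first variant and accepts the second.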
I expect that setting the host byte to DID_IMM_RETRY and calling
scsi_done() will make a difference; otherwise I wouldn't have suggested
this. As explained in my previous email, doing that triggers the SCSI
command completion and resubmission paths. Resubmission only happens if
the SCSI error handler has not yet been scheduled. The SCSI error
handler is scheduled once every pending command has either been
completed via scsi_done() or has timed out. In other words, setting the host byte to
DID_IMM_RETRY and calling scsi_done() makes it possible for the error
handler to be scheduled, something that won't happen if
ufshcd_queuecommand() systematically returns SCSI_MLQUEUE_HOST_BUSY. In
the latter case the block layer timer is reset over and over again. See
also the blk_mq_start_request() call in scsi_queue_rq(). One could wonder
whether this is really what the SCSI core should do if a SCSI LLD keeps
returning the SCSI_MLQUEUE_HOST_BUSY status code ...
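The argument above can be sketched as a toy simulation. This is a rough
user-space model of the reasoning only, not of the actual SCSI core or
block layer code. All names are hypothetical, time is counted in abstract
ticks, and it assumes that a rejected command is resubmitted on every
tick, that is, faster than its timeout.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_policy {
	TOY_RETURN_HOST_BUSY,	/* LLD keeps answering "busy" */
	TOY_COMPLETE_IMM_RETRY,	/* LLD completes with a retry status */
};

/*
 * Returns the tick at which the toy error handler can start, or -1 if it
 * never starts within max_ticks.
 */
static int toy_eh_start_tick(enum toy_policy policy, int eh_request_tick,
			     int timeout_ticks, int max_ticks)
{
	bool pending = true;		/* one command the LLD keeps rejecting */
	bool eh_requested = false;
	int timer = 0;			/* ticks since the timer was (re)started */

	assert(timeout_ticks > 1);	/* resubmission is faster than the timeout */

	for (int tick = 0; tick < max_ticks; tick++) {
		if (tick >= eh_request_tick)
			eh_requested = true;

		if (pending && policy == TOY_RETURN_HOST_BUSY) {
			/* Requeued; resubmitting restarts the timer. */
			timer = 0;
		} else if (pending) {
			/* Completed; resubmitted only if no EH is requested. */
			pending = false;
			if (!eh_requested) {
				pending = true;
				timer = 0;
			}
		}

		/* A command that stays pending long enough times out. */
		if (pending && ++timer >= timeout_ticks)
			pending = false;

		/* The handler starts once nothing is pending any more. */
		if (eh_requested && !pending)
			return tick;
	}
	return -1;
}
```

With eh_request_tick = 5, timeout_ticks = 30 and max_ticks = 1000, the
"busy" policy never lets the toy handler start (the function returns -1),
while the "complete with retry" policy lets it start at tick 5.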
Bart.