On 9/7/21 9:56 AM, Bart Van Assche wrote:
On 9/7/21 8:43 AM, Adrian Hunter wrote:
No. Requests cannot make progress when ufshcd_state is
UFSHCD_STATE_EH_SCHEDULED_FATAL, and only the error handler can change
that,
so if the error handler is waiting to enter the queue and
blk_mq_freeze_queue()
is waiting for outstanding requests, they will deadlock.
How about adding the above text as a comment above
ufshcd_clear_ua_wluns() such
that this information becomes available to those who have not followed this
conversation?
After having given patch 1/3 some further thought: an unfortunate
effect of this patch is that unit attention clearing is skipped for
the states UFSHCD_STATE_EH_SCHEDULED_FATAL and UFSHCD_STATE_RESET.
How about replacing patch 1/3 with the untested patch below since that
patch does not have the disadvantage of sometimes skipping clearing UA?
Thanks,
Bart.
[PATCH] scsi: ufs: Fix a recently introduced deadlock
Completing pending commands with DID_IMM_RETRY triggers the following
code paths:
scsi_complete()
-> scsi_queue_insert()
-> __scsi_queue_insert()
-> scsi_device_unbusy()
-> scsi_dec_host_busy()
-> scsi_eh_wakeup()
-> blk_mq_requeue_request()
scsi_queue_rq()
-> scsi_host_queue_ready()
-> scsi_host_in_recovery()
Fixes: a113eaaf8637 ("scsi: ufs: Synchronize SCSI and UFS error handling")
Signed-off-by: Bart Van Assche <bvanassche@xxxxxxx>
---
drivers/scsi/ufs/ufshcd.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index c2c614da1fb8..9560f34f3d27 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -2707,6 +2707,14 @@ static int ufshcd_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd)
}
fallthrough;
case UFSHCD_STATE_RESET:
+ /*
+ * The SCSI error handler only starts after all pending commands
+ * have failed or timed out. Complete commands with
+ * DID_IMM_RETRY to allow the error handler to start
+ * if it has been scheduled.
+ */
+ set_host_byte(cmd, DID_IMM_RETRY);
+ cmd->scsi_done(cmd);
err = SCSI_MLQUEUE_HOST_BUSY;
goto out;
case UFSHCD_STATE_ERROR: