On 13/09/21 7:33 pm, Bart Van Assche wrote:
> On 9/13/21 1:53 AM, Adrian Hunter wrote:
>> scsi_dec_host_busy() is called for any non-zero return value like
>> SCSI_MLQUEUE_HOST_BUSY:
>>
>> i.e.
>> 	reason = scsi_dispatch_cmd(cmd);
>> 	if (reason) {
>> 		scsi_set_blocked(cmd, reason);
>> 		ret = BLK_STS_RESOURCE;
>> 		goto out_dec_host_busy;
>> 	}
>>
>> 	return BLK_STS_OK;
>>
>> out_dec_host_busy:
>> 	scsi_dec_host_busy(shost, cmd);
>>
>> And that will wake the error handler:
>>
>> static void scsi_dec_host_busy(struct Scsi_Host *shost, struct scsi_cmnd *cmd)
>> {
>> 	unsigned long flags;
>>
>> 	rcu_read_lock();
>> 	__clear_bit(SCMD_STATE_INFLIGHT, &cmd->state);
>> 	if (unlikely(scsi_host_in_recovery(shost))) {
>> 		spin_lock_irqsave(shost->host_lock, flags);
>> 		if (shost->host_failed || shost->host_eh_scheduled)
>> 			scsi_eh_wakeup(shost);
>> 		spin_unlock_irqrestore(shost->host_lock, flags);
>> 	}
>> 	rcu_read_unlock();
>> }
>
> Returning SCSI_MLQUEUE_HOST_BUSY is not sufficient to wake up the SCSI
> error handler because of the following test in scsi_error_handler():
>
> 	shost->host_failed != scsi_host_busy(shost)

Returning SCSI_MLQUEUE_HOST_BUSY causes the value of scsi_host_busy() to
drop, because scsi_dec_host_busy() is called as described above, so the
request is no longer counted in that condition.

>
> As I mentioned in a previous email, all pending commands must have failed
> or timed out before the error handler is woken up. Returning
> SCSI_MLQUEUE_HOST_BUSY from ufshcd_queuecommand() does not fail a command
> and prevents it from timing out. Hence my suggestion to change
> "return SCSI_MLQUEUE_HOST_BUSY" into set_host_byte(cmd, DID_IMM_RETRY)
> followed by cmd->scsi_done(cmd). A possible alternative is to move the
> blk_mq_start_request() call in the SCSI core such that the block layer
> request timer is not reset if a SCSI LLD returns SCSI_MLQUEUE_HOST_BUSY.
>
> Bart.
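
To make the counting argument in the reply above concrete, here is a minimal
userspace sketch. It is a toy model, not kernel code: struct toy_host and all
toy_*() names are hypothetical, host_eh_scheduled is left out, and locking and
RCU are ignored. It only illustrates that once a rejected command stops being
counted as busy, the "failed == busy" comparison can become true. It says
nothing about the request timer question raised in the quoted text.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical names, not the kernel's data structures. */
struct toy_host {
	int busy;         /* commands currently counted as in flight */
	int failed;       /* commands handed over to error handling */
	bool in_recovery; /* host has entered recovery */
	int eh_wakeups;   /* how often the handler would have been woken */
};

/* The handler may run only once every counted command has failed. */
static bool toy_eh_may_run(const struct toy_host *h)
{
	return h->failed > 0 && h->failed == h->busy;
}

/* Record a wakeup if the condition above holds. */
static void toy_eh_wakeup(struct toy_host *h)
{
	if (toy_eh_may_run(h))
		h->eh_wakeups++;
}

/* A command enters the queueing path and is counted as busy. */
static void toy_cmd_start(struct toy_host *h)
{
	h->busy++;
}

/* A counted command fails or times out; it stays counted as busy. */
static void toy_cmd_fail(struct toy_host *h)
{
	h->failed++;
	h->in_recovery = true;
	toy_eh_wakeup(h);
}

/* The driver rejects a command with a "host busy" style return value. */
static void toy_cmd_rejected_busy(struct toy_host *h)
{
	assert(h->busy > 0);
	h->busy--;
	if (h->in_recovery && h->failed)
		toy_eh_wakeup(h);
}

/* Scenario: A has failed, B is rejected as busy; returns the wakeup count. */
static int toy_demo(void)
{
	struct toy_host h = {0};

	toy_cmd_start(&h);         /* A: busy = 1 */
	toy_cmd_start(&h);         /* B: busy = 2 */
	toy_cmd_fail(&h);          /* A fails: failed = 1, busy = 2, no wakeup */
	toy_cmd_rejected_busy(&h); /* B rejected: busy = 1 == failed, wakeup */
	return h.eh_wakeups;
}
```

In this model toy_demo() records exactly one wakeup, and it happens at the
point where command B is rejected, not where command A fails.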