On 6/28/21 11:23 PM, Can Guo wrote: > On 2021-06-29 01:31, Bart Van Assche wrote: >> On 6/28/21 1:17 AM, Can Guo wrote: >>> On 2021-06-25 01:11, Bart Van Assche wrote: >>>> On 6/23/21 11:31 PM, Can Guo wrote: >>>>> Using back host_sem in suspend_prepare()/resume_complete() >>>>> won't have this problem of deadlock, right? >>>> >>>> Although that would solve the deadlock discussed in this email >>>> thread, it wouldn't solve the issue of potential adverse >>>> interactions of the UFS error handler and the SCSI error >>>> handler running concurrently. >>> >>> I think I've explained it before, paste it here - >>> >>> ufshcd_eh_host_reset_handler() invokes ufshcd_err_handler() and >>> flushes it, so SCSI error handler and UFS error handler can >>> safely run together. >> >> That code path is the exception. Do you agree that the following >> three functions all invoke the ufshcd_err_handler() function >> asynchronously? * ufshcd_uic_pwr_ctrl() * ufshcd_check_errors() * >> ufshcd_abort() > > I agree, but I don't see what's wrong with that. Any context can > invoke ufs error handler asynchronously and ufs error handler prepare > makes sure error handler can work safely, i.e., stopping PM > ops/gating/scaling in error handler prepare makes sure no one shall > call ufshcd_uic_pwr_ctrl() ever again. And ufshcd_check_errors() and > ufshcd_abort() are OK to run concurrently with UFS error handler. The current UFS error handling approach requires the following code in ufshcd_queuecommand(): if (hba->pm_op_in_progress) { hba->force_reset = true; set_host_byte(cmd, DID_BAD_TARGET); cmd->scsi_done(cmd); goto out; } Removing that code is not possible with the current error handling approach. My patch makes it possible to remove that code. > Sorry that I missed the change of scsi_transport_template() in your > previous message. I can understand that you want to invoke UFS error > hander by invoking SCSI error handler, but I didn't go that far > because I saw you changed pm_runtime_get_sync() to > pm_runtime_get_noresume() in ufs error handler prepare. How can that > change make sure that the device is not suspending or resuming while > error handler is running? UFS power state transitions happen by submitting a SCSI command to a WLUN. The SCSI error handler is only activated after all outstanding SCSI commands for a SCSI host have failed or completed. I think this guarantees for the UFS driver that eh_strategy_handler is not invoked while a command submitted to a WLUN is changing the power state of the UFS device. The following code from scsi_error.c only wakes up the error handler if (shost->host_failed || shost->host_eh_scheduled) && shost->host_failed == scsi_host_busy(shost): if ((shost->host_failed == 0 && shost->host_eh_scheduled == 0) || shost->host_failed != scsi_host_busy(shost)) { schedule(); continue; } /* Handle SCSI errors */ Thanks, Bart.