On 5/24/21 1:36 AM, Can Guo wrote:
> @@ -2688,6 +2705,43 @@ static int ufshcd_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd)
> +	case UFSHCD_STATE_EH_SCHEDULED_FATAL:
> +		/*
> +		 * pm_runtime_get_sync() is used at error handling preparation
> +		 * stage. If a scsi cmd, e.g. the SSU cmd, is sent from hba's
> +		 * PM ops, it can never be finished if we let SCSI layer keep
> +		 * retrying it, which gets err handler stuck forever. Neither
> +		 * can we let the scsi cmd pass through, because UFS is in bad
> +		 * state, the scsi cmd may eventually time out, which will get
> +		 * err handler blocked for too long. So, just fail the scsi cmd
> +		 * sent from PM ops, err handler can recover PM error anyways.
> +		 */
> +		if (hba->pm_op_in_progress) {
> +			hba->force_reset = true;
> +			set_host_byte(cmd, DID_BAD_TARGET);
> +			cmd->scsi_done(cmd);
> +			goto out;
> +		}
> +		fallthrough;

Hi Can,

I know that this patch only moves the above code and that the above code
has not been introduced by this patch. Anyway, is my understanding correct
that ufshcd_err_handler() can change the host controller state from
UFSHCD_STATE_EH_SCHEDULED_FATAL into UFSHCD_STATE_RESET and next into
UFSHCD_STATE_OPERATIONAL? If so, if the above code completes a READ with
status DID_BAD_TARGET and if recovery by the error handler succeeds, will
that cause the filesystem above the UFS driver to change into read-only
mode? If the above code completes a WRITE with status DID_BAD_TARGET, will
that cause data corruption?

Is there any other solution to prevent data corruption than merging the
UFSHCD_STATE_EH_SCHEDULED_FATAL and UFSHCD_STATE_EH_SCHEDULED_NON_FATAL
back into a single state and changing the ufshcd_rpm_get_sync(hba) call in
ufshcd_err_handling_prepare() into a pm_runtime_get_noresume() call?

Thanks,

Bart.
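
To make the scenario in the questions above concrete, here is a minimal
userspace sketch of the dispatch decision and the state sequence being asked
about. It is not the driver code: every model_* name is hypothetical, the
requeue behaviour for the RESET state is an assumption based on the quoted
comment ("let SCSI layer keep retrying it"), and the FATAL -> RESET ->
OPERATIONAL sequence is the one the mail asks about, not a confirmed
description of ufshcd_err_handler().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the host controller states named in the mail. */
enum model_state {
	MODEL_STATE_OPERATIONAL,
	MODEL_STATE_RESET,
	MODEL_STATE_EH_SCHEDULED_FATAL,
};

/* Possible outcomes of submitting one command in this model. */
enum model_result {
	MODEL_DISPATCHED,	/* command would be handed to the hardware */
	MODEL_REQUEUED,		/* assumption: SCSI layer is asked to retry later */
	MODEL_FAILED,		/* completed immediately, as with DID_BAD_TARGET */
};

struct model_hba {
	enum model_state state;
	bool pm_op_in_progress;
	bool force_reset;
};

/* Mirrors the shape of the quoted hunk: fail PM-originated commands in FATAL. */
static enum model_result model_queuecommand(struct model_hba *hba)
{
	assert(hba != NULL);

	switch (hba->state) {
	case MODEL_STATE_OPERATIONAL:
		return MODEL_DISPATCHED;
	case MODEL_STATE_EH_SCHEDULED_FATAL:
		if (hba->pm_op_in_progress) {
			hba->force_reset = true;
			return MODEL_FAILED;
		}
		/* fall through */
	case MODEL_STATE_RESET:
		return MODEL_REQUEUED;
	}
	return MODEL_REQUEUED;
}

/* Hypothetical successful recovery: FATAL -> RESET -> OPERATIONAL. */
static void model_err_handler(struct model_hba *hba)
{
	assert(hba != NULL);

	hba->state = MODEL_STATE_RESET;
	hba->force_reset = false;
	hba->state = MODEL_STATE_OPERATIONAL;
}
```

In this model a command submitted while pm_op_in_progress is set and the
state is FATAL gets MODEL_FAILED, and that result is final even though a
later model_err_handler() call returns the host to OPERATIONAL. That is the
window the READ/WRITE questions are about.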