On 8/1/24 5:22 AM, Bart Van Assche wrote:
> Hi,
>
> Recently I noticed that a particular UFS-based device does not resume
> correctly. The logs of the device show that sd_start_stop_device() does
> not retry the START STOP UNIT command if the device reports a unit
> attention. I think that's a bug in the SCSI core. The following hack
> makes resume work again. I think this confirms my understanding of this
> issue (sd_start_stop_device() sets RQF_PM):
>
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index da7dac77f8cd..e21becc5bcf9 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -1816,6 +1816,8 @@ bool scsi_noretry_cmd(struct scsi_cmnd *scmd)
>  	 * assume caller has checked sense and determined
>  	 * the check condition was retryable.
>  	 */
> +	if (req->rq_flags & RQF_PM)
> +		return false;
>  	if (req->cmd_flags & REQ_FAILFAST_DEV || blk_rq_is_passthrough(req))
>  		return true;
>
> My understanding is that SCSI pass-through commands submitted from
> user space must not be retried. Are there any objections against
> modifying the behavior of the SCSI core such that it retries
> REQ_OP_DRV_* operations submitted by the SCSI core, as illustrated
> by the pseudo-code below?

Looking at the code, e.g. sd_start_stop_device():

	res = scsi_execute_cmd(sdp, cmd, REQ_OP_DRV_IN, NULL, 0, SD_TIMEOUT,
			       sdkp->max_retries, &exec_args);

It seems that the retry count is expected to be honored. But that is
indeed not the case, as scsi_noretry_cmd() will always return true for
REQ_OP_DRV_* commands, which means they are never retried.

So maybe we should have a RQF_USER_OP_DRV flag to differentiate user
REQ_OP_DRV_* passthrough commands from internally issued REQ_OP_DRV_*
commands. Or the reverse flag, e.g. RQF_INTERNAL_OP_DRV, that we can set
in e.g. scsi_execute_cmd().

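To make the second option concrete, here is a minimal user-space sketch of
the check order it would give. This is only a model, not kernel code:
struct model_req, its fields and model_noretry() are hypothetical stand-ins,
and RQF_INTERNAL_OP_DRV is just the flag name floated above, not an existing
kernel flag.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for request properties; not kernel definitions. */
struct model_req {
	bool failfast_dev;	/* models REQ_FAILFAST_DEV being set in cmd_flags */
	bool passthrough;	/* models blk_rq_is_passthrough(req) */
	bool internal;		/* models the proposed (assumed) RQF_INTERNAL_OP_DRV */
};

/*
 * Models the tail of scsi_noretry_cmd() for a retryable check condition.
 * Returns true when the command must NOT be retried.
 */
static bool model_noretry(const struct model_req *req)
{
	if (req->failfast_dev)
		return true;	/* fail fast always wins */
	if (req->internal)
		return false;	/* issued by the SCSI core: honor the retry count */
	if (req->passthrough)
		return true;	/* user-space pass-through: never retry */
	return false;		/* regular I/O: retry */
}
```

With this ordering, a pass-through request that carries the internal marker
is retried, while a user-space pass-through request is still failed without
a retry, and fail-fast requests are never retried in either case.
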
>
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index da7dac77f8cd..e21becc5bcf9 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -1816,6 +1816,12 @@ bool scsi_noretry_cmd(struct scsi_cmnd *scmd)
>  	 * assume caller has checked sense and determined
>  	 * the check condition was retryable.
>  	 */
> -	if (req->cmd_flags & REQ_FAILFAST_DEV || blk_rq_is_passthrough(req))
> -		return true;
> +	if (req->cmd_flags & REQ_FAILFAST_DEV)
> +		return true;
> +	if (/* submitted by the SCSI core */)
> +		return false;
> +	if (blk_rq_is_passthrough(req))
> +		return true;
>
> Thanks,
>
> Bart.

-- 
Damien Le Moal
Western Digital Research