On 9/29/22 9:13 AM, Martin Wilck wrote: > On Wed, 2022-09-28 at 21:53 -0500, Mike Christie wrote: >> This simplifies sd_spinup_disk so scsi-ml retries errors for it. Note >> that >> we retried specifically on a UA and also if scsi_status_is_good >> returned >> failed which could happen for all check conditions, so in this patch >> we >> don't check for only UAs. >> >> We do not handle the outside loop's retries because we want to sleep >> between tried and we don't support that yet. >> >> Signed-off-by: Mike Christie <michael.christie@xxxxxxxxxx> >> --- >> drivers/scsi/sd.c | 76 ++++++++++++++++++++++++++++----------------- >> -- >> 1 file changed, 45 insertions(+), 31 deletions(-) >> >> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c >> index a35c089c3097..716e0c8ffa57 100644 >> --- a/drivers/scsi/sd.c >> +++ b/drivers/scsi/sd.c >> @@ -2064,50 +2064,64 @@ sd_spinup_disk(struct scsi_disk *sdkp) >> { >> unsigned char cmd[10]; >> unsigned long spintime_expire = 0; >> - int retries, spintime; >> + int spintime; >> unsigned int the_result; >> struct scsi_sense_hdr sshdr; >> int sense_valid = 0; >> + struct scsi_failure failures[] = { >> + { >> + .sense = SCMD_FAILURE_SENSE_ANY, >> + .asc = SCMD_FAILURE_ASC_ANY, >> + .ascq = SCMD_FAILURE_ASCQ_ANY, >> + .allowed = 3, >> + .result = SAM_STAT_CHECK_CONDITION, >> + }, >> + { >> + /* >> + * Retry scsi status and host errors that >> return >> + * failure in scsi_status_is_good. >> + */ >> + .result = SAM_STAT_BUSY | >> + SAM_STAT_RESERVATION_CONFLICT | >> + SAM_STAT_TASK_SET_FULL | >> + SAM_STAT_ACA_ACTIVE | >> + SAM_STAT_TASK_ABORTED | > > I fail to understand how bitwise-or would work here. IIUC, this tries > to replicate the logic to retry if (!scsi_status_is_good()). The result > of this bitwise-or operation is 0x78, which matches all SAM_ codes > except SAM_STAT_GOOD, SAM_STAT_CHECK_CONDITION and > SAM_STAT_CONDITION_MET. SAM_STAT_CHECK_CONDITION is covered by the > first failure. But unless I'm mistaken, we'd now retry on > SAM_STAT_INTERMEDIATE, SAM_STAT_INTERMEDIATE_CONDITION_MET, and > SAM_STAT_COMMAND_TERMINATED, on which the old code did not retry. Am I > overlooking something? I'm not sure what happened. For some reason I thought they were bitmaps. Will fix. > > At least this deserves an in-depth comment; in general, as noted > for patch 02/35, I find using bitwise or for SAM status counter- > intuitive. > >> + DID_NO_CONNECT << 16, > > Shouldn't .allowed be set to 3 here? OTOH that would cause the number Forgot the 3. > of retries to add up to 6, see my reply to 22/35. But don't see what > effect a failure with allowed = 0 would have. > > > Regards > Martin >