Hello Hannes, sry that I only now reply to the other patches of the series. On Wed, Mar 01, 2017 at 10:15:17AM +0100, Hannes Reinecke wrote: > To detect if a failed command has been retried we must not > clear scmd->eh_eflags when EH finishes. > The flag should be persistent throughout the lifetime > of the command. > > Reviewed-by: Johannes Thumshirn <jthumshirn@xxxxxxx> > Signed-off-by: Hannes Reinecke <hare@xxxxxxx> > --- > Documentation/scsi/scsi_eh.txt | 14 +++++++------- > drivers/scsi/libsas/sas_scsi_host.c | 2 -- > drivers/scsi/scsi_error.c | 4 ++-- > include/scsi/scsi_eh.h | 1 + > 4 files changed, 10 insertions(+), 11 deletions(-) > > diff --git a/Documentation/scsi/scsi_eh.txt b/Documentation/scsi/scsi_eh.txt > index 37eca00..4edb9c1c 100644 > --- a/Documentation/scsi/scsi_eh.txt > +++ b/Documentation/scsi/scsi_eh.txt > @@ -105,11 +105,14 @@ function > > 2. If the host supports asynchronous completion (as indicated by the > no_async_abort setting in the host template) scsi_abort_command() > - is invoked to schedule an asynchrous abort. If that fails > - Step #3 is taken. > + is invoked to schedule an asynchrous abort. > + Asynchronous abort are not invoked for commands which the > + SCSI_EH_ABORT_SCHEDULED flag is set (this indicates that the command > + already had been aborted once, and this is a retry which failed), > + or when the EH deadline is expired. In these case Step #3 is taken. > > - 2. scsi_eh_scmd_add(scmd, SCSI_EH_CANCEL_CMD) is invoked for the > - command. See [1-3] for more information. > + 3. scsi_eh_scmd_add(scmd, SCSI_EH_CANCEL_CMD) is invoked for the > + command. See [1-4] for more information. > > [1-3] Asynchronous command aborts > > @@ -263,7 +266,6 @@ scmd->allowed. > > 3. scmd recovered > ACTION: scsi_eh_finish_cmd() is invoked to EH-finish scmd > - - clear scmd->eh_eflags > - scsi_setup_cmd_retry() > - move from local eh_work_q to local eh_done_q > LOCKING: none > @@ -456,8 +458,6 @@ except for #1 must be implemented by eh_strategy_handler(). > > - shost->host_failed is zero. > > - - Each scmd's eh_eflags field is cleared. > - > - Each scmd is in such a state that scsi_setup_cmd_retry() on the > scmd doesn't make any difference. > > diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c > index ee6b39a..87e5079 100644 > --- a/drivers/scsi/libsas/sas_scsi_host.c > +++ b/drivers/scsi/libsas/sas_scsi_host.c > @@ -613,8 +613,6 @@ static void sas_eh_handle_sas_errors(struct Scsi_Host *shost, struct list_head * > SAS_DPRINTK("trying to find task 0x%p\n", task); > res = sas_scsi_find_task(task); > > - cmd->eh_eflags = 0; > - > switch (res) { > case TASK_IS_DONE: > SAS_DPRINTK("%s: task 0x%p is done\n", __func__, > diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c > index cec439c..e1ca3b8 100644 > --- a/drivers/scsi/scsi_error.c > +++ b/drivers/scsi/scsi_error.c > @@ -189,7 +189,6 @@ static int scsi_host_eh_past_deadline(struct Scsi_Host *shost) > /* > * Retry after abort failed, escalate to next level. > */ > - scmd->eh_eflags &= ~SCSI_EH_ABORT_SCHEDULED; > SCSI_LOG_ERROR_RECOVERY(3, > scmd_printk(KERN_INFO, scmd, > "previous abort failed\n")); > @@ -933,6 +932,7 @@ void scsi_eh_prep_cmnd(struct scsi_cmnd *scmd, struct scsi_eh_save *ses, > ses->result = scmd->result; > ses->underflow = scmd->underflow; > ses->prot_op = scmd->prot_op; > + ses->eh_eflags = scmd->eh_eflags; > > scmd->prot_op = SCSI_PROT_NORMAL; > scmd->eh_eflags = 0; > @@ -996,6 +996,7 @@ void scsi_eh_restore_cmnd(struct scsi_cmnd* scmd, struct scsi_eh_save *ses) > scmd->result = ses->result; > scmd->underflow = ses->underflow; > scmd->prot_op = ses->prot_op; > + scmd->eh_eflags = ses->eh_eflags; > } > EXPORT_SYMBOL(scsi_eh_restore_cmnd); > > @@ -1140,7 +1141,6 @@ static int scsi_eh_reset(struct scsi_cmnd *scmd) > */ > void scsi_eh_finish_cmd(struct scsi_cmnd *scmd, struct list_head *done_q) > { > - scmd->eh_eflags = 0; > list_move_tail(&scmd->eh_entry, done_q); > } > EXPORT_SYMBOL(scsi_eh_finish_cmd); > diff --git a/include/scsi/scsi_eh.h b/include/scsi/scsi_eh.h > index 98d366b..a25b328 100644 > --- a/include/scsi/scsi_eh.h > +++ b/include/scsi/scsi_eh.h > @@ -31,6 +31,7 @@ extern int scsi_get_sense_info_fld(const u8 * sense_buffer, int sb_len, > struct scsi_eh_save { > /* saved state */ > int result; > + int eh_eflags; > enum dma_data_direction data_direction; > unsigned underflow; > unsigned char cmd_len; > -- > 1.8.5.6 > So I think the code is good. The only thing I am missing is a bit of reasoning why we want to escalate immediately after an already retried command fails again. Anyway, would not require code-changes. Reviewed-by: Benjamin Block <bblock@xxxxxxxxxxxxxxxxxx> Beste Grüße / Best regards, - Benjamin Block -- Linux on z Systems Development / IBM Systems & Technology Group IBM Deutschland Research & Development GmbH Vorsitz. AufsR.: Martina Koederitz / Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294