+ linux-scsi, Martin and James On 11/8/22 08:57, Niklas Cassel wrote: > On Mon, Nov 07, 2022 at 05:01:53PM +0000, John Garry wrote: >> On 07/11/2022 16:10, Niklas Cassel wrote: >>> scsi_queue_rq() will check if scsi_host_in_recovery() (state is >>> SHOST_RECOVERY), and if so, it will _not_ issue a command via: >>> scsi_dispatch_cmd() -> host->hostt->queuecommand() (ata_scsi_queuecmd()) >>> -> __ata_scsi_queuecmd() -> ata_scsi_translate() -> ata_qc_issue() >>> >>> Before commit e494f6a72839 ("[SCSI] improved eh timeout handler"), >>> when receiving a TFES error IRQ, this was the call chain: >>> ahci_error_intr() -> ata_port_abort() -> ata_do_link_abort() -> >>> ata_qc_complete() -> ata_qc_schedule_eh() -> blk_abort_request() -> >>> blk_rq_timed_out() -> q->rq_timed_out_fn() (scsi_times_out()) -> >>> scsi_eh_scmd_add() -> scsi_host_set_state(shost, SHOST_RECOVERY) >>> >>> Which meant that as soon as the error IRQ was serviced, no additional >>> commands sent from the block layer (scsi_queue_rq()) would be sent down >>> to the device. (ATA internal commands originating for ATA EH could of >>> course still be sent.) >>> >>> However, after commit e494f6a72839 ("[SCSI] improved eh timeout handler"), >>> scsi_times_out() would instead result in a call to scsi_abort_command() -> >>> queue_delayed_work(). work function: scmd_eh_abort_handler() would call >>> scsi_eh_scmd_add(), which calls scsi_host_set_state(shost, SHOST_RECOVERY). >>> >>> (It was possible to get the old behavior if host->hostt->no_async_abort >>> was set, but libata never used it, and this option was completely removed >>> in commit a06586325f37 ("scsi: make asynchronous aborts mandatory")) >>> >>> Additionally, later in commit 358f70da49d7 ("blk-mq: make >>> blk_abort_request() trigger timeout path"), blk_abort_request() was changed >>> to also call the abort callback asynchronously. >>> >>> So now, after the TFES error irq has been serviced, we need to wait for >>> two different workqueues to run their work, before the SHOST_RECOVERY >>> state gets set. >>> >>> While the ATA specification states that a device should return command >>> aborted for all commands queued after the device has entered error state, >>> since ATA only keeps the sense data for the latest command (in non-NCQ >>> case), we really don't want to send block layer commands to the device >>> after it has entered error state. (Only ATA EH commands should be sent, >>> to read the sense data etc.) >>> >>> While we could just call scsi_host_set_state(shost, SHOST_RECOVERY) from >>> ata_qc_schedule_eh() directly, that might be a layering violation. >>> So instead of doing that, add an additional check against the libata's own >>> EH flag(s) before calling the qc_defer callback. >>> >>> Fixes: e494f6a72839 ("[SCSI] improved eh timeout handler") >>> Signed-off-by: Niklas Cassel <niklas.cassel@xxxxxxx> >>> --- >>> Changes since v1: >>> -Implemented review comments from Damien. >>> >>> drivers/ata/libata-scsi.c | 20 ++++++++++++++++++++ >>> 1 file changed, 20 insertions(+) >>> >>> diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c >>> index 4cb914103382..383a208f5f99 100644 >>> --- a/drivers/ata/libata-scsi.c >>> +++ b/drivers/ata/libata-scsi.c >>> @@ -1736,6 +1736,26 @@ static int ata_scsi_translate(struct ata_device *dev, struct scsi_cmnd *cmd, >>> if (xlat_func(qc)) >>> goto early_finish; >>> + /* >>> + * scsi_queue_rq() will defer commands when in state SHOST_RECOVERY. >>> + * >>> + * When getting an error interrupt, ata_port_abort() will be called, >>> + * which ends up calling ata_qc_schedule_eh() on all QCs. >>> + * >>> + * ata_qc_schedule_eh() will call ata_eh_set_pending() and then call >>> + * blk_abort_request() on the given QC. blk_abort_request() will >>> + * asynchronously end up calling scsi_eh_scmd_add(), which will set >>> + * the state to SHOST_RECOVERY and wake up SCSI EH. >>> + * >>> + * In order to avoid requests from being issued to the device from the >>> + * time ata_eh_set_pending() is called, to the time scsi_eh_scmd_add() >>> + * sets the state to SHOST_RECOVERY, we defer requests here as well. >>> + */ >>> + if (ap->pflags & (ATA_PFLAG_EH_PENDING | ATA_PFLAG_EH_IN_PROGRESS)) { >>> + rc = ATA_DEFER_LINK; >>> + goto defer; >> >> Could we move this check earlier? I mean, we didn't need to have the qc >> alloc'ed and xlat'ed for this check to be done, right? > > Sure, we could put it in e.g. ata_scsi_queuecmd() or __ata_scsi_queuecmd(). > > > Or, perhaps it is just time to accept that ATA EH is so interconnected with > SCSI EH, so the simplest thing is just to do: > > --- a/drivers/ata/libata-eh.c > +++ b/drivers/ata/libata-eh.c > @@ -913,6 +913,7 @@ void ata_qc_schedule_eh(struct ata_queued_cmd *qc) > > qc->flags |= ATA_QCFLAG_FAILED; > ata_eh_set_pending(ap, 1); > + scsi_host_set_state(ap->scsi_host, SHOST_RECOVERY); Why put this in this function ? Nothing in ata_qc_schedule_eh() calls scsi_schedule_eh() or scsi_eh_scmd_add(), which set that state. It looks like ata_port->ops->sched_eh(), that is, ata_std_sched_eh(), would be a better place for this. > > /* The following will fail if timeout has already expired. > * ata_scsi_error() takes care of such scmds on EH entry. > > > Which appears to work just as well as the patch in $subject. > > In commit ee7863bc68fa ("[PATCH] SCSI: implement shost->host_eh_scheduled") > Tejun mentioned that "... libata is planning to depart from SCSI, so, for the > time being, libata will be using SCSI EH to handle such exceptions." > > Now, 16 years later, ATA is still using SCSI EH (see ata_port_schedule_eh() and > ata_std_sched_eh()) to schedule EH in the case when there are no QCs to abort: > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/ata/libata-eh.c?h=v6.1-rc4#n1004 > > Damien, thoughts? I like this simpler fix better, but it does introduce a risk of (again) having problems with ata EH if scsi EH trigger/timing is changed. Unlikely now, but as this fix proves, not unheard of. The v2 change is fine too, modulo John's suggestion, which I agree with. At least it is consistent with the ata internal eh state accounting, so self contained within libata and somewhat less dependent on scsi state machine. Martin ? James ? Thoughts ? -- Damien Le Moal Western Digital Research