Re: [PATCH] ata: libata-core: do not issue non-internal commands once EH is pending

Damien Le Moal <damien.lemoal@xxxxxxxxxxxxxxxxxx> · Mon, 7 Nov 2022 22:40:00 +0900

On 11/7/22 18:51, Niklas Cassel wrote:
> scsi_queue_rq() will check if scsi_host_in_recovery() (state is
> SHOST_RECOVERY), and if so, it will _not_ issue a command via:
> scsi_dispatch_cmd() -> host->hostt->queuecommand() (ata_scsi_queuecmd())
> -> __ata_scsi_queuecmd() -> ata_scsi_translate() -> ata_qc_issue()
> 
> Before commit e494f6a72839 ("[SCSI] improved eh timeout handler"),
> when receiving a TFES error IRQ, this is the call chain:
> ahci_error_intr() -> ata_port_abort() -> ata_do_link_abort() ->
> ata_qc_complete() -> ata_qc_schedule_eh() -> blk_abort_request() ->
> blk_rq_timed_out() -> q->rq_timed_out_fn() (scsi_times_out()) ->
> scsi_eh_scmd_add() -> scsi_host_set_state(shost, SHOST_RECOVERY)
> 
> Which meant that as soon as the error IRQ was serviced, no additional
> commands sent from the block layer (scsi_queue_rq()) would be sent
> down to the device. (ATA internal commands originating for ATA EH
> could of course still be sent.)
> 
> However, after commit e494f6a72839 ("[SCSI] improved eh timeout handler"),
> scsi_times_out() would instead result in a call to scsi_abort_command() ->
> queue_delayed_work(). work function: scmd_eh_abort_handler() would call
> scsi_eh_scmd_add(), which calls scsi_host_set_state(shost, SHOST_RECOVERY).
> 
> (It was possible to get the old behavior if host->hostt->no_async_abort
> was set, but libata never used it, and this option was completely removed
> in commit a06586325f37 ("scsi: make asynchronous aborts mandatory"))
> 
> Additionally, later in commit 358f70da49d7 ("blk-mq: make
> blk_abort_request() trigger timeout path") blk_abort_request() was changed
> to also call the abort callback asynchronously.
> 
> Tejun mentioned that:
> "Note that this makes blk_abort_request() asynchronous - it initiates
> abortion but the actual termination will happen after a short while,
> even when the caller owns the request.  AFAICS, SCSI and ATA should be
> fine with that and I think mtip32xx and dasd should be safe but not
> completely sure."
> 
> So now, after the TFES irq has been serviced, we need to wait for two
> different workqueues to run their work, before the SHOST_RECOVERY state
> gets set.
> 
> While the ATA specification states that a device should return command
> aborted for all commands queued after the device has entered error state,
> since ATA only keeps the sense data for the latest command (in non-NCQ
> case), we really don't want to send block layer commands to the device
> after it has entered error state. (Only ATA EH commands should be sent,
> to read the sense data etc.)
> 
> While we could just call scsi_host_set_state(shost, SHOST_RECOVERY) from
> ata_qc_schedule_eh() directly, that smells like a layering violation,
> So instead of doing that, add an additional check against the libata's
> own EH flag(s) in the existing qc_defer callback.
> 
> Fixes: e494f6a72839 ("[SCSI] improved eh timeout handler")
> Signed-off-by: Niklas Cassel <niklas.cassel@xxxxxxx>
> ---
>  drivers/ata/libata-core.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
> index 884ae73b11ea..ab6c69be3d4a 100644
> --- a/drivers/ata/libata-core.c
> +++ b/drivers/ata/libata-core.c
> @@ -4458,6 +4458,9 @@ int ata_std_qc_defer(struct ata_queued_cmd *qc)
>  {
>  	struct ata_link *link = qc->dev->link;
>  

The check looks OK but I think it is in the wrong function. Not all
drivers use ata_std_qc_defer(). Some implement their own. So to avoid
leaking this check depending on the LLD, I think it should be done from
the only place where ->qc_defer method is called: ata_scsi_translate().

Also, it would be nice to add a comment above the check explaining
(ideally shortly) why it is done.

> +	if (qc->ap->pflags & (ATA_PFLAG_EH_PENDING | ATA_PFLAG_EH_IN_PROGRESS))
> +		return ATA_DEFER_LINK;
> +
>  	if (ata_is_ncq(qc->tf.protocol)) {
>  		if (!ata_tag_valid(link->active_tag))
>  			return 0;

-- 
Damien Le Moal
Western Digital Research