[PATCH 2/2] ata: libata: skip error analysis for commands that are not errors

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



A NCQ error means that the device has aborted a single NCQ command and
halted further processing of queued commands.
To get the single NCQ command that caused the NCQ error, host software has
to read the NCQ error log, which also takes the device out of error state.

ata_eh_link_autopsy() starts off by calling ata_eh_analyze_ncq_error() to
read the NCQ error log, to find the offending command that caused the NCQ
error. ata_eh_analyze_ncq_error() marks the offending command by setting
qc->err_mask to AC_ERR_NCQ.

ata_eh_link_autopsy() then continues with further analysis for all
commands owned by libata EH.

However, once we have found the offending command, we know that all other
commands cannot be an error. (Theoretically a command could have timed out
just before the NCQ error happened (so EH was scheduled but did not yet
run), such command will have AC_ERR_TIMEOUT set in qc->err_mask.)

Therefore, after finding the offending command, we know that we can simply
skip the per command analysis for all commands that have not been marked
as error at this point, since we know that they have to be retried anyway.

Do this by changing ata_eh_analyze_ncq_error() to mark all non-failed
commands as ATA_QCFLAG_RETRY, and change the loop in ata_eh_link_autopsy()
to skip commands marked as ATA_QCFLAG_RETRY.

Signed-off-by: Niklas Cassel <niklas.cassel@xxxxxxx>
---
Flag ATA_QCFLAG_FAILED is kept since it means that the command is owned
by libata EH. A failed command will always have qc->err_mask set.
Perhaps flag ATA_QCFLAG_FAILED should be renamed to something like
ATA_QCFLAG_OWNED_BY_EH to further clarify this meaning, and that the flag
does not necessarily mean that it is an error.

 drivers/ata/libata-eh.c   | 1 +
 drivers/ata/libata-sata.c | 8 ++++++++
 2 files changed, 9 insertions(+)

diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index bde15f855f70..34303ce67c14 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -1955,6 +1955,7 @@ static void ata_eh_link_autopsy(struct ata_link *link)
 
 	ata_qc_for_each_raw(ap, qc, tag) {
 		if (!(qc->flags & ATA_QCFLAG_FAILED) ||
+		    qc->flags & ATA_QCFLAG_RETRY ||
 		    ata_dev_phys_link(qc->dev) != link)
 			continue;
 
diff --git a/drivers/ata/libata-sata.c b/drivers/ata/libata-sata.c
index 6b2dcf3eb2fb..18ef14e749a0 100644
--- a/drivers/ata/libata-sata.c
+++ b/drivers/ata/libata-sata.c
@@ -1493,6 +1493,14 @@ void ata_eh_analyze_ncq_error(struct ata_link *link)
 		 */
 		qc->result_tf.status &= ~ATA_ERR;
 		qc->result_tf.error = 0;
+
+		/*
+		 * If we get a NCQ error, that means that a single command was
+		 * aborted. All other failed commands for our link should be
+		 * retried and has no business of going though further scrutiny
+		 * by ata_eh_link_autopsy().
+		 */
+		qc->flags |= ATA_QCFLAG_RETRY;
 	}
 
 	ehc->i.err_mask &= ~AC_ERR_DEV;
-- 
2.38.1




[Index of Archives]     [Linux Filesystems]     [Linux SCSI]     [Linux RAID]     [Git]     [Kernel Newbies]     [Linux Newbie]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Samba]     [Device Mapper]

  Powered by Linux