Re: [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination.

On 10/30/2016 01:43 PM, Andrey Grodzovsky wrote:
> Problem:
> This is a workaround for a bug with LSI Fusion MPT SAS2 when
> performing secure erase. Due to the very long time the operation
> takes, commands issued during the erase will time out and will trigger
> execution of the abort hook. Even though the abort hook is called for
> the specific command which timed out, this leads to an entire device halt
> (scsi_state terminated) and premature termination of the secure erase.

Actually, it is _not_ the erase command which times out; it's the subsequent commands, as the controller is unable to process them while the erase is running. I suspect a bug in the SAT layer of the mpt3sas firmware, which simply does not return 'busy' for additional commands while an erase is in progress.

That being said, this issue was obscured prior to the introduction of asynchronous aborts. Originally a timeout would invoke SCSI EH, which would wait for all outstanding commands to complete. So by the time SCSI EH actually ran, the erase command had already completed, allowing for a successful retry of the failing command. With asynchronous aborts we don't have this option: the abort will succeed, but the command cannot be retried as the original erase command is still running.

In light of the above I guess we need something like the attached patch. I'm not utterly proud of it, but I guess it's the best we can do for the moment.
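
To make the intent of the patch concrete, here is a minimal user-space sketch of the gating it introduces. It is a model only, not driver code: struct model_device, model_queue(), model_complete() and the enum values are hypothetical stand-ins for sas_device_priv_data->block, scsih_qcmd(), _scsih_io_done() and the SCSI_MLQUEUE_* return codes. The opcode values 0xa1 and 0x85 are the ones the patch tests. The real patch also blocks the SCSI device queue through _scsih_internal_device_block(), which this sketch reduces to a single flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ATA pass-through opcodes (SAT); the same values the patch checks. */
#define ATA_12 0xa1
#define ATA_16 0x85

/* Hypothetical stand-ins for the midlayer queuecommand return codes. */
enum queue_result { QUEUED = 0, DEVICE_BUSY = 1 };

/* Hypothetical per-device state, modelling sas_device_priv_data->block. */
struct model_device {
	bool block;
};

static bool is_ata_passthrough(const uint8_t *cdb)
{
	return cdb[0] == ATA_12 || cdb[0] == ATA_16;
}

/*
 * Model of the submission path: refuse new commands while the device
 * is blocked, and block it as soon as a pass-through command is queued.
 */
static enum queue_result model_queue(struct model_device *dev,
				     const uint8_t *cdb)
{
	if (dev->block)
		return DEVICE_BUSY;

	if (is_ata_passthrough(cdb))
		dev->block = true;

	return QUEUED;
}

/* Model of the completion path: unblock once the pass-through is done. */
static void model_complete(struct model_device *dev, const uint8_t *cdb)
{
	if (is_ata_passthrough(cdb)) {
		/* A pass-through command always blocked the device on submit. */
		assert(dev->block);
		dev->block = false;
	}
}
```

In this model, queueing an ATA_16 command returns QUEUED and blocks the device; a READ(10) submitted afterwards gets DEVICE_BUSY instead of being sent down to time out, and is accepted again once model_complete() has run for the ATA_16 command.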

Cheers,

Hannes
--
Dr. Hannes Reinecke		      zSeries & Storage
hare@xxxxxxx			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
From 1556746987c3b4c1a1a4705625280b1136554f89 Mon Sep 17 00:00:00 2001
From: Hannes Reinecke <hare@xxxxxxx>
Date: Sun, 30 Oct 2016 14:24:44 +0100
Subject: [PATCH] mpt3sas: hack: disable concurrent commands for ATA_16/ATA_12

There's a bug in the mpt3sas driver/firmware: it does not return
BUSY while it is busy processing a request (e.g. 'erase') and cannot
respond to other commands. Hence these commands will time out
and eventually start the error handler.
This patch blocks request processing for the device whenever an
ATA_12 or ATA_16 command is received, and unblocks it when that
command completes, thereby avoiding this problem.

Signed-off-by: Hannes Reinecke <hare@xxxxxxxx>
---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 97987e7..18b9f09 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -4096,6 +4096,13 @@ scsih_qcmd(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
 	    sas_device_priv_data->block)
 		return SCSI_MLQUEUE_DEVICE_BUSY;
 
+	/*
+	 * Hack: block the device for any ATA_12/ATA_16 command
+	 */
+	if (scmd->cmnd[0] == 0xa1 || scmd->cmnd[0] == 0x85) {
+		sas_device_priv_data = scmd->device->hostdata;
+		_scsih_internal_device_block(scmd->device, sas_device_priv_data);
+	}
 	if (scmd->sc_data_direction == DMA_FROM_DEVICE)
 		mpi_control = MPI2_SCSIIO_CONTROL_READ;
 	else if (scmd->sc_data_direction == DMA_TO_DEVICE)
@@ -4835,6 +4842,10 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply)
 
  out:
 
+	if (scmd->cmnd[0] == 0xa1 || scmd->cmnd[0] == 0x85) {
+		sas_device_priv_data = scmd->device->hostdata;
+		_scsih_internal_device_unblock(scmd->device, sas_device_priv_data);
+	}
 	scsi_dma_unmap(scmd);
 
 	scmd->scsi_done(scmd);
-- 
2.6.6

