RE: SCSI: usage of DID_REQUEUE vs DID_RESET for returning SCSI commands to be retried

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Adding direct email addresses of few people to avoid any filters.

Hannes/Martin/James/Tomas/Christoph,

Can you please comment on this?

Thanks,
Sumit

>-----Original Message-----
>From: Sumit Saxena [mailto:sumit.saxena@xxxxxxxxxxxx]
>Sent: Tuesday, December 13, 2016 6:50 PM
>To: 'linux-scsi'
>Subject: SCSI: usage of DID_REQUEUE vs DID_RESET for returning SCSI
>commands to be retried
>
>Hi all,
>
>I have query regarding usage of host_byte DID_REQUEUE vs DID_RESET
returned
>by LLD to SCSI mid layer.
>
>Let me give some background here.
>I am using megaraid_sas controller. megaraid_sas driver returns all
outstanding
>SCSI commands back to SCSI layer with DID_RESET host_byte before
resetting
>controller.
>The intent is- all these commands returned with DID_RESET before
controller
>reset should be retried by SCSI.
>
>In few distros, I have observed that if SYNCHRONIZE_CACHE command(should
be
>applicable for all non Read write commands) is outstanding before
resetting
>controller  and driver returns those commands back with DID_RESET then
>SYNCHRONIZE_CACHE command not retried but failed to upper layer but other
>READ/WRITE commands were not failed but retried. I was running filesystem
IOs
>and SYNHRONIZE_CACHE returned with error cause filesystem to go in READ
only
>mode.
>
>Later I checked and realized below commit was missing in that distro
kernel and
>seems to cause the problem-
>
>a621bac scsi_lib: correctly retry failed zero length REQ_TYPE_FS commands
>
>But distro kernel has below commit-
>
>89fb4cd scsi: handle flush errors properly
>
>Issue does not hit on the kernels which don't have both commits but hits
when
>commit- "89fb4cd scsi: handle flush errors properly " is there but
commit-
>"a621bac scsi_lib: correctly retry failed zero length REQ_TYPE_FS
commands" is
>missing.
>
>This issue is observed with mpt3sas driver as well and should be
applicable to all
>LLDs returning non read write commands with DID_RESET.
>
>Returning DID_REQUEUE instead of DID_RESET from driver solves the problem
>irrespective of whether these above mentioned commits are there or not in
>kernel.
>So I am thinking to use DID_REQUEUE instead of DID_RESET in megaraid_sas
>driver for all SCSI commands(not only limited to SYNCHRONIZE_CACHE or non
>read write commands) before resetting controller. As I mentioned intent
is those
>outstanding commands returned by driver before doing controller reset
should be
>retried and as soon as reset is complete, driver will be processing those
>commands. There is no sense key associated with SCSI commands returned.
>
>I browsed SCSI code and get to know DID_REQUEUE causes command to be
>retried by calling- scsi_queue_insert(cmd, SCSI_MLQUEUE_DEVICE_BUSY).
>This seems to be good enough for our requirement of commands to be
re-tried
>by SCSI layer.
>
>Please provide feedback if anyone forsee any issue with using DID_REQUEUE
for
>this use case.
>I will be doing some testing with DID_REQUEUE in place of DID_RESET in
>megaraid_sas driver.
>
>I have attached here a proposed patch for megaraid_sas driver.
>If there are no concerns, I will send this patch to SCSI mailing list.
>
>Thanks,
>Sumit
--- Begin Message ---
Driver returns outstanding SCSI commands to SCSI layer with host_byte
set to DID_RESET before resetting controller so that SCSI layer should
retry
those commands.
Sending DID_RESET for non RW commands(e.g SYNCHRONIZE_CACHE) may cause
those commands getting failed and returning I/O erros on few distros which
has included commit- 89fb4cd scsi: handle flush errors properly but missed
commit- a621bac scsi_lib: correctly retry failed zero length REQ_TYPE_FS
commands.
However Read write commands returned with DID_RESET are not failed but
retried.

DID_REQUEUE seems safer to use instead of DID_RESET for all outstanding
commands before doing chip reset as it serves purpose of getting all
commands
to be retried by SCSI layer.

Signed-off-by: Sumit Saxena <sumit.saxena@xxxxxxxxxxxx>
Signed-off-by: Kashyap Desai <kashyap.desai@xxxxxxxxxxxx>
---
 drivers/scsi/megaraid/megaraid_sas_base.c   | 4 ++--
 drivers/scsi/megaraid/megaraid_sas_fusion.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c
b/drivers/scsi/megaraid/megaraid_sas_base.c
index 6484c38..959ce3e 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -1662,7 +1662,7 @@ megasas_queue_command(struct Scsi_Host *shost,
struct scsi_cmnd *scmd)
 	/* Check for an mpio path and adjust behavior */
 	if (atomic_read(&instance->adprecovery) ==
MEGASAS_ADPRESET_SM_INFAULT) {
 		if (megasas_check_mpio_paths(instance, scmd) ==
-		    (DID_RESET << 16)) {
+		    (DID_REQUEUE << 16)) {
 			return SCSI_MLQUEUE_HOST_BUSY;
 		} else {
 			scmd->result = DID_NO_CONNECT << 16;
@@ -2483,7 +2483,7 @@ static int megasas_wait_for_outstanding(struct
megasas_instance *instance)
 						struct megasas_cmd, list);
 			list_del_init(&reset_cmd->list);
 			if (reset_cmd->scmd) {
-				reset_cmd->scmd->result = DID_RESET << 16;
+				reset_cmd->scmd->result = DID_REQUEUE <<
16;
 				dev_notice(&instance->pdev->dev, "%d:%p
reset [%02x]\n",
 					reset_index, reset_cmd,
 					reset_cmd->scmd->cmnd[0]);
diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c
b/drivers/scsi/megaraid/megaraid_sas_fusion.c
index 24778ba..58cbe30 100644
--- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
+++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
@@ -3362,7 +3362,7 @@ int megasas_check_mpio_paths(struct megasas_instance
*instance,
 	struct scsi_cmnd *scmd)
 {
 	struct megasas_instance *peer_instance = NULL;
-	int retval = (DID_RESET << 16);
+	int retval = (DID_REQUEUE << 16);
 
 	if (instance->peerIsPresent) {
 		peer_instance = megasas_get_peer_instance(instance);
-- 
2.4.11


--- End Message ---

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [SCSI Target Devel]     [Linux SCSI Target Infrastructure]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Samba]     [Device Mapper]

  Powered by Linux