Patch "scsi: cxlflash: Fix to avoid EEH and host reset collisions" has been added to the 4.4-stable tree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



This is a note to let you know that I've just added the patch titled

    scsi: cxlflash: Fix to avoid EEH and host reset collisions

to the 4.4-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     scsi-cxlflash-fix-to-avoid-eeh-and-host-reset-collisions.patch
and it can be found in the queue-4.4 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.


>From 1d3324c382b1a617eb567e3650dcb51f22dfec9a Mon Sep 17 00:00:00 2001
From: "Matthew R. Ochs" <mrochs@xxxxxxxxxxxxxxxxxx>
Date: Fri, 2 Sep 2016 15:39:30 -0500
Subject: scsi: cxlflash: Fix to avoid EEH and host reset collisions

From: Matthew R. Ochs <mrochs@xxxxxxxxxxxxxxxxxx>

commit 1d3324c382b1a617eb567e3650dcb51f22dfec9a upstream.

The EEH reset handler is ignorant to the current state of the driver
when processing a frozen event and initiating a device reset. This can
be an issue if an EEH event occurs while a user or stack initiated reset
is executing. More specifically, if an EEH occurs while the SCSI host
reset handler is active, the reset initiated by the EEH thread will
likely collide with the host reset thread. This can leave the device in
an inconsistent state, or worse, cause a system crash.

As a remedy, the EEH handler is updated to evaluate the device state and
take appropriate action (proceed, wait, or disconnect host). The host
reset handler is also updated to handle situations where an EEH occurred
during a host reset. In such situations, the host reset handler will
delay reporting back a success to give the EEH reset an opportunity to
complete.

Signed-off-by: Matthew R. Ochs <mrochs@xxxxxxxxxxxxxxxxxx>
Acked-by: Uma Krishnan <ukrishn@xxxxxxxxxxxxxxxxxx>
Signed-off-by: Martin K. Petersen <martin.petersen@xxxxxxxxxx>
Cc: Sumit Semwal <sumit.semwal@xxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>

---
 drivers/scsi/cxlflash/main.c |   15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -1962,6 +1962,11 @@ retry:
  * cxlflash_eh_host_reset_handler() - reset the host adapter
  * @scp:	SCSI command from stack identifying host.
  *
+ * Following a reset, the state is evaluated again in case an EEH occurred
+ * during the reset. In such a scenario, the host reset will either yield
+ * until the EEH recovery is complete or return success or failure based
+ * upon the current device state.
+ *
  * Return:
  *	SUCCESS as defined in scsi/scsi.h
  *	FAILED as defined in scsi/scsi.h
@@ -1993,7 +1998,8 @@ static int cxlflash_eh_host_reset_handle
 		} else
 			cfg->state = STATE_NORMAL;
 		wake_up_all(&cfg->reset_waitq);
-		break;
+		ssleep(1);
+		/* fall through */
 	case STATE_RESET:
 		wait_event(cfg->reset_waitq, cfg->state != STATE_RESET);
 		if (cfg->state == STATE_NORMAL)
@@ -2534,6 +2540,9 @@ static void drain_ioctls(struct cxlflash
  * @pdev:	PCI device struct.
  * @state:	PCI channel state.
  *
+ * When an EEH occurs during an active reset, wait until the reset is
+ * complete and then take action based upon the device state.
+ *
  * Return: PCI_ERS_RESULT_NEED_RESET or PCI_ERS_RESULT_DISCONNECT
  */
 static pci_ers_result_t cxlflash_pci_error_detected(struct pci_dev *pdev,
@@ -2547,6 +2556,10 @@ static pci_ers_result_t cxlflash_pci_err
 
 	switch (state) {
 	case pci_channel_io_frozen:
+		wait_event(cfg->reset_waitq, cfg->state != STATE_RESET);
+		if (cfg->state == STATE_FAILTERM)
+			return PCI_ERS_RESULT_DISCONNECT;
+
 		cfg->state = STATE_RESET;
 		scsi_block_requests(cfg->host);
 		drain_ioctls(cfg);


Patches currently in stable-queue which might be from mrochs@xxxxxxxxxxxxxxxxxx are

queue-4.4/scsi-cxlflash-fix-to-avoid-eeh-and-host-reset-collisions.patch
queue-4.4/scsi-cxlflash-scan-host-only-after-the-port-is-ready-for-i-o.patch
queue-4.4/scsi-cxlflash-improve-eeh-recovery-time.patch



[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]