[PATCH] scsi: qla2xxx: I/Os timing out on surprise removal of

When removing an adapter through sysfs, some in-flight I/Os can get
stuck and take a while to complete (they actually time out and are
retried). We are not handling an early error exit from
qla2xxx_eh_abort properly.

Fixes: 45235022da99 ("scsi: qla2xxx: Fix driver unload by shutting down chip") 
Signed-off-by: Bill Kuzeja <william.kuzeja@xxxxxxxxxxx>
---

When doing a sysfs remove of a QLogic adapter, the driver's remove
function gets called and we end up aborting all in-progress I/Os.
Here is the code flow:

qla2x00_remove_one
  qla2x00_abort_isp_cleanup
    qla2x00_abort_all_cmds
      __qla2x00_abort_all_cmds
        qla2xxx_eh_abort

At the start of qla2xxx_eh_abort, there are some sanity checks done 
before actually sending the abort. One of these checks is a call to 
fc_block_scsi_eh. In the case of a sysfs remove, it turns out that this 
routine can exit with FAST_IO_FAIL.

When this occurs, we return to __qla2x00_abort_all_cmds with an
extra reference on sp (because the abort never gets sent). Originally, I
remedied this kind of situation with another fix:

commit 4cd3b6ebff85 scsi: qla2xxx: Fix extraneous ref on sp's after adapter break

But this later change complicated matters:

commit 45235022da99 scsi: qla2xxx: Fix driver unload by shutting down chip

Because the abort is now being done earlier in the teardown (through 
qla2x00_abort_isp_cleanup), in qla2xxx_eh_abort we make it past 
the first check because qla2x00_isp_reg_stat(ha) returns zero. When we
fail a few lines later in fc_block_scsi_eh, this error is not handled
properly in __qla2x00_abort_all_cmds and the I/O ends up hanging and 
timing out because of the extra reference.

For this fix, add the FAST_IO_FAIL case to the check in
__qla2x00_abort_all_cmds that tests whether qla2xxx_eh_abort succeeded.

This removes the extra reference in this additional early exit case. In 
my testing, this eliminates the timeouts and delays and the remove 
proceeds smoothly.
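
As a rough illustration of the case described above, here is a small
user-space sketch (not driver code). The enum, the function names and the
boolean inputs are all hypothetical stand-ins: reg_disconnected models a
nonzero qla2x00_isp_reg_stat(ha), rport_fast_fail models fc_block_scsi_eh
returning FAST_IO_FAIL, and the model assumes that a successfully sent
abort drops the reference on its own.

```c
#include <stdbool.h>

/* Stand-ins for the SCSI status codes; the values are illustrative and
 * are not copied from the kernel headers. */
enum eh_status { EH_SUCCESS, EH_FAILED, EH_FAST_IO_FAIL };

/* Simplified model of the two early exits in qla2xxx_eh_abort. */
enum eh_status model_eh_abort(bool reg_disconnected, bool rport_fast_fail)
{
	if (reg_disconnected)
		return EH_FAILED;        /* first check: abort never sent */
	if (rport_fast_fail)
		return EH_FAST_IO_FAIL;  /* second check: abort never sent */
	return EH_SUCCESS;               /* abort sent (assumed to drop the ref) */
}

/* Condition used before this patch to drop the extra reference. */
bool drop_ref_old(enum eh_status status, bool reg_disconnected)
{
	return status == EH_FAILED && reg_disconnected;
}

/* Condition after this patch: also covers the FAST_IO_FAIL early exit. */
bool drop_ref_new(enum eh_status status, bool reg_disconnected)
{
	return (status == EH_FAILED && reg_disconnected) ||
	       status == EH_FAST_IO_FAIL;
}

/* References left on sp after cleanup, starting from the one extra
 * reference held across the abort call. */
int leftover_refs(bool reg_disconnected, bool rport_fast_fail, bool patched)
{
	int refs = 1;
	enum eh_status st = model_eh_abort(reg_disconnected, rport_fast_fail);
	bool drop = patched ? drop_ref_new(st, reg_disconnected)
			    : drop_ref_old(st, reg_disconnected);

	if (st == EH_SUCCESS || drop)
		refs--;
	return refs;
}
```

In this model, leftover_refs(false, true, false) leaves one reference
behind (the hang described above), while the same inputs with patched set
to true leave none.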

---
 drivers/scsi/qla2xxx/qla_os.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index 42b8f0d..3ba3765 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -1771,8 +1771,9 @@ uint32_t qla2x00_isp_reg_stat(struct qla_hw_data *ha)
 					 * if immediate exit from
 					 * ql2xxx_eh_abort
 					 */
-					if (status == FAILED &&
-					    (qla2x00_isp_reg_stat(ha)))
+					if (((status == FAILED) &&
+					    (qla2x00_isp_reg_stat(ha))) ||
+					     (status == FAST_IO_FAIL))
 						atomic_dec(
 						    &sp->ref_count);
 				}
-- 
1.8.3.1



