Re: FAILED: patch "[PATCH] scsi: lpfc: Fix bad ndlp ptr in xri aborted handling" failed to apply to 5.3-stable tree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Dec 14, 2019 at 04:02:44PM +0100, gregkh@xxxxxxxxxxxxxxxxxxx wrote:

The patch below does not apply to the 5.3-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable@xxxxxxxxxxxxxxx>.

thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

From 324e1c402069e8d277d2a2b18ce40bde1265b96a Mon Sep 17 00:00:00 2001
From: James Smart <jsmart2021@xxxxxxxxx>
Date: Fri, 18 Oct 2019 14:18:21 -0700
Subject: [PATCH] scsi: lpfc: Fix bad ndlp ptr in xri aborted handling

In cases where I/O may be aborted, such as driver unload or link bounces,
the system will crash based on a bad ndlp pointer.

Example:
 RIP: 0010:lpfc_sli4_abts_err_handler+0x15/0x140 [lpfc]
 ...
 lpfc_sli4_io_xri_aborted+0x20d/0x270 [lpfc]
 lpfc_sli4_sp_handle_abort_xri_wcqe.isra.54+0x84/0x170 [lpfc]
 lpfc_sli4_fp_handle_cqe+0xc2/0x480 [lpfc]
 __lpfc_sli4_process_cq+0xc6/0x230 [lpfc]
 __lpfc_sli4_hba_process_cq+0x29/0xc0 [lpfc]
 process_one_work+0x14c/0x390

Crash was caused by a bad ndlp address passed to I/O indicated by the XRI
aborted CQE.  The address was not NULL so the routine deferenced the ndlp
ptr. The bad ndlp also caused the lpfc_sli4_io_xri_aborted to call an
erroneous io handler.  Root cause for the bad ndlp was an lpfc_ncmd that
was aborted, put on the abort_io list, completed, taken off the abort_io
list, sent to lpfc_release_nvme_buf where it was put back on the abort_io
list because the lpfc_ncmd->flags setting LPFC_SBUF_XBUSY was not cleared
on the final completion.

Rework the exchange busy handling to ensure the flags are properly set for
both scsi and nvme.

Fixes: c490850a0947 ("scsi: lpfc: Adapt partitioned XRI lists to efficient sharing")
Cc: <stable@xxxxxxxxxxxxxxx> # v5.1+
Link: https://lore.kernel.org/r/20191018211832.7917-6-jsmart2021@xxxxxxxxx
Signed-off-by: Dick Kennedy <dick.kennedy@xxxxxxxxxxxx>
Signed-off-by: James Smart <jsmart2021@xxxxxxxxx>
Signed-off-by: Martin K. Petersen <martin.petersen@xxxxxxxxxx>

For 5.3 there were context changes needed as a result of

	c00f62e6c546 ("scsi: lpfc: Merge per-protocol WQ/CQ pairs into single per-cpu pair")

I've fixed up the patch and queued it for 5.3.

--
Thanks,
Sasha



[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux