Re: [PATCH 07/19] lpfc: Fix IO failure during hba reset testing with nvme io.

On 01/24/2018 11:45 PM, James Smart wrote:
> A stress test repeatedly resetting the adapter while performing
> io would eventually report I/O failures and missing nvme namespaces.
> 
> The driver was setting the nvmefc_fcp_req->private pointer to NULL
> during the IO completion routine before upcalling done().
> If the transport was also running an abort for that IO, the driver
> would fail the abort with message 6140. Failing the abort is not
> allowed by the nvme-fc transport, as it mandates that the io must be
> returned back to the transport. As that does not happen, the transport
> controller delete has an outstanding reference and can't complete
> teardown.
> 
> Remove the NULL'ing of the private pointer in the nvmefc request.
> The driver simply overwrites this value on each IO start.
> 
> Signed-off-by: Dick Kennedy <dick.kennedy@xxxxxxxxxxxx>
> Signed-off-by: James Smart <james.smart@xxxxxxxxxxxx>
> ---
>  drivers/scsi/lpfc/lpfc_nvme.c | 3 ---
>  1 file changed, 3 deletions(-)
> 
> diff --git a/drivers/scsi/lpfc/lpfc_nvme.c b/drivers/scsi/lpfc/lpfc_nvme.c
> index 81e3a4f10c3c..92643ffa79c3 100644
> --- a/drivers/scsi/lpfc/lpfc_nvme.c
> +++ b/drivers/scsi/lpfc/lpfc_nvme.c
> @@ -804,7 +804,6 @@ lpfc_nvme_io_cmd_wqe_cmpl(struct lpfc_hba *phba, struct lpfc_iocbq *pwqeIn,
>  	struct nvme_fc_cmd_iu *cp;
>  	struct lpfc_nvme_rport *rport;
>  	struct lpfc_nodelist *ndlp;
> -	struct lpfc_nvme_fcpreq_priv *freqpriv;
>  	struct lpfc_nvme_lport *lport;
>  	unsigned long flags;
>  	uint32_t code, status;
> @@ -980,8 +979,6 @@ lpfc_nvme_io_cmd_wqe_cmpl(struct lpfc_hba *phba, struct lpfc_iocbq *pwqeIn,
>  			phba->cpucheck_cmpl_io[lpfc_ncmd->cpu]++;
>  	}
>  #endif
> -	freqpriv = nCmd->private;
> -	freqpriv->nvme_buf = NULL;
>  
>  	/* NVME targets need completion held off until the abort exchange
>  	 * completes unless the NVME Rport is getting unregistered.
> 
I would avoid that if possible.
By not zeroing the pointer we run the risk of executing the wrong
callback on stale commands.
Can't you just modify the abort handling to always return 'true' if this
condition is hit?
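
A minimal sketch of that idea, written as a standalone userspace model
rather than actual lpfc code. All names here (fake_buf, fake_req,
fake_complete, fake_abort) are hypothetical and only illustrate the
control flow: the completion path keeps clearing the back-pointer, and
the abort path treats a cleared pointer as "already completed" instead
of failing.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's per-IO buffer. */
struct fake_buf {
	int tag;
};

/* Hypothetical stand-in for the transport's request. */
struct fake_req {
	struct fake_buf *priv;	/* back-pointer, cleared on completion */
	bool done_called;	/* request was handed back to the transport */
};

/* Completion path: still clears the back-pointer before the upcall. */
void fake_complete(struct fake_req *req)
{
	req->priv = NULL;
	req->done_called = true;
}

/*
 * Abort path: a cleared back-pointer means the completion already ran,
 * so there is nothing left to abort. Report success instead of failing.
 */
bool fake_abort(struct fake_req *req)
{
	if (!req->priv)
		return true;	/* already completed, treat as handled */

	/* Still outstanding: a real driver would issue the abort here. */
	req->priv = NULL;
	req->done_called = true;
	return true;
}
```

In this model the abort never fails, so the request always ends up back
with the transport. The model ignores locking; a real driver would
presumably need to do the check under whatever lock protects the buffer.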

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@xxxxxxx			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


