Re: [PATCH v1] xprtrdma: Fix corner cases when handling device removal

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Michal-

Have you tried this one? If we can nail this down now, I can easily
get it into the v4.17 merge window.


> On Mar 14, 2018, at 10:42 AM, Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
> 
> Michal Kalderon has found some corner cases around device unload
> with active NFS mounts that I didn't have the imagination to test
> when xprtrdma device removal was added last year.
> 
> - The ULP device removal handler is responsible for deallocating
>  the PD. That wasn't clear to me initially, and my own testing
>  suggested it was not necessary, but that is incorrect.
> 
> - The transport destruction path can no longer assume that there
>  is a valid ID.
> 
> - When destroying a transport, ensure that ib_free_cq() is not
>  invoked on a CQ that was already released.
> 
> Reported-by: Michal Kalderon <Michal.Kalderon@xxxxxxxxxx>
> Fixes: bebd031866ca ("xprtrdma: Support unplugging an HCA from ...")
> Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx
> ---
> net/sunrpc/xprtrdma/verbs.c |   13 +++++++++----
> 1 file changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
> index d19ea02..c284ee7 100644
> --- a/net/sunrpc/xprtrdma/verbs.c
> +++ b/net/sunrpc/xprtrdma/verbs.c
> @@ -251,7 +251,6 @@
> 		wait_for_completion(&ia->ri_remove_done);
> 
> 		ia->ri_id = NULL;
> -		ia->ri_pd = NULL;
> 		ia->ri_device = NULL;
> 		/* Return 1 to ensure the core destroys the id. */
> 		return 1;
> @@ -449,7 +448,9 @@
> 		ia->ri_id->qp = NULL;
> 	}
> 	ib_free_cq(ep->rep_attr.recv_cq);
> +	ep->rep_attr.recv_cq = NULL;
> 	ib_free_cq(ep->rep_attr.send_cq);
> +	ep->rep_attr.send_cq = NULL;
> 
> 	/* The ULP is responsible for ensuring all DMA
> 	 * mappings and MRs are gone.
> @@ -462,6 +463,8 @@
> 		rpcrdma_dma_unmap_regbuf(req->rl_recvbuf);
> 	}
> 	rpcrdma_mrs_destroy(buf);
> +	ib_dealloc_pd(ia->ri_pd);
> +	ia->ri_pd = NULL;
> 
> 	/* Allow waiters to continue */
> 	complete(&ia->ri_remove_done);
> @@ -629,14 +632,16 @@
> {
> 	cancel_delayed_work_sync(&ep->rep_connect_worker);
> 
> -	if (ia->ri_id->qp) {
> +	if (ia->ri_id && ia->ri_id->qp) {
> 		rpcrdma_ep_disconnect(ep, ia);
> 		rdma_destroy_qp(ia->ri_id);
> 		ia->ri_id->qp = NULL;
> 	}
> 
> -	ib_free_cq(ep->rep_attr.recv_cq);
> -	ib_free_cq(ep->rep_attr.send_cq);
> +	if (ep->rep_attr.recv_cq)
> +		ib_free_cq(ep->rep_attr.recv_cq);
> +	if (ep->rep_attr.send_cq)
> +		ib_free_cq(ep->rep_attr.send_cq);
> }
> 
> /* Re-establish a connection after a device removal event.
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
Chuck Lever



--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux