Hi Michal- Have you tried this one? If we can nail this down now, I can easily get it into the v4.17 merge window. > On Mar 14, 2018, at 10:42 AM, Chuck Lever <chuck.lever@xxxxxxxxxx> wrote: > > Michal Kalderon has found some corner cases around device unload > with active NFS mounts that I didn't have the imagination to test > when xprtrdma device removal was added last year. > > - The ULP device removal handler is responsible for deallocating > the PD. That wasn't clear to me initially, and my own testing > suggested it was not necessary, but that is incorrect. > > - The transport destruction path can no longer assume that there > is a valid ID. > > - When destroying a transport, ensure that ib_free_cq() is not > invoked on a CQ that was already released. > > Reported-by: Michal Kalderon <Michal.Kalderon@xxxxxxxxxx> > Fixes: bebd031866ca ("xprtrdma: Support unplugging an HCA from ...") > Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx> > Cc: stable@xxxxxxxxxxxxxxx > --- > net/sunrpc/xprtrdma/verbs.c | 13 +++++++++---- > 1 file changed, 9 insertions(+), 4 deletions(-) > > diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c > index d19ea02..c284ee7 100644 > --- a/net/sunrpc/xprtrdma/verbs.c > +++ b/net/sunrpc/xprtrdma/verbs.c > @@ -251,7 +251,6 @@ > wait_for_completion(&ia->ri_remove_done); > > ia->ri_id = NULL; > - ia->ri_pd = NULL; > ia->ri_device = NULL; > /* Return 1 to ensure the core destroys the id. */ > return 1; > @@ -449,7 +448,9 @@ > ia->ri_id->qp = NULL; > } > ib_free_cq(ep->rep_attr.recv_cq); > + ep->rep_attr.recv_cq = NULL; > ib_free_cq(ep->rep_attr.send_cq); > + ep->rep_attr.send_cq = NULL; > > /* The ULP is responsible for ensuring all DMA > * mappings and MRs are gone. > @@ -462,6 +463,8 @@ > rpcrdma_dma_unmap_regbuf(req->rl_recvbuf); > } > rpcrdma_mrs_destroy(buf); > + ib_dealloc_pd(ia->ri_pd); > + ia->ri_pd = NULL; > > /* Allow waiters to continue */ > complete(&ia->ri_remove_done); > @@ -629,14 +632,16 @@ > { > cancel_delayed_work_sync(&ep->rep_connect_worker); > > - if (ia->ri_id->qp) { > + if (ia->ri_id && ia->ri_id->qp) { > rpcrdma_ep_disconnect(ep, ia); > rdma_destroy_qp(ia->ri_id); > ia->ri_id->qp = NULL; > } > > - ib_free_cq(ep->rep_attr.recv_cq); > - ib_free_cq(ep->rep_attr.send_cq); > + if (ep->rep_attr.recv_cq) > + ib_free_cq(ep->rep_attr.recv_cq); > + if (ep->rep_attr.send_cq) > + ib_free_cq(ep->rep_attr.send_cq); > } > > /* Re-establish a connection after a device removal event. > > -- > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Chuck Lever -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html