> From: Kalderon, Michal
> Sent: Monday, March 19, 2018 7:22 PM
> To: 'Chuck Lever' <chuck.lever@xxxxxxxxxx>
> Cc: linux-rdma@xxxxxxxxxxxxxxx
> Subject: RE: [PATCH v1] xprtrdma: Fix corner cases when handling device
> removal
>
> > From: Chuck Lever [mailto:chuck.lever@xxxxxxxxxx]
> > Sent: Monday, March 19, 2018 4:59 PM
> > To: Kalderon, Michal <Michal.Kalderon@xxxxxxxxxx>
> > Cc: linux-rdma@xxxxxxxxxxxxxxx
> > Subject: Re: [PATCH v1] xprtrdma: Fix corner cases when handling
> > device removal
> >
> > Hi Michal-
> >
> > Have you tried this one? If we can nail this down now, I can easily
> > get it into the v4.17 merge window.
>
> Yes, looks good, thanks.

Chuck, while testing the changes I ran into a different issue, though.
If I mount, using mount -o rdma, a server that does not have the NFS RDMA
port in its port list, and then run rmmod qedr, it hangs with the
following stack:

[root@lb-tlvb-pcie37 ~]# cat /proc/5981/stack
[<0>] rpcrdma_conn_upcall+0x26a/0x310 [rpcrdma]
[<0>] cma_remove_one+0x26d/0x2a0 [rdma_cm]
[<0>] ib_unregister_device+0xcc/0x180 [ib_core]
[<0>] qedr_remove+0x37/0x80 [qedr]
[<0>] qede_rdma_unregister_driver+0x4b/0x90 [qede]
[<0>] SyS_delete_module+0x1c2/0x240
[<0>] do_syscall_64+0x6e/0x190
[<0>] entry_SYSCALL_64_after_hwframe+0x3d/0xa2

(wait_for_completion(&ia->ri_remove_done))

Thanks,
Michal

> >
> >
> > > On Mar 14, 2018, at 10:42 AM, Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
> > >
> > > Michal Kalderon has found some corner cases around device unload
> > > with active NFS mounts that I didn't have the imagination to test
> > > when xprtrdma device removal was added last year.
> > >
> > > - The ULP device removal handler is responsible for deallocating
> > >   the PD. That wasn't clear to me initially, and my own testing
> > >   suggested it was not necessary, but that is incorrect.
> > >
> > > - The transport destruction path can no longer assume that there is
> > >   a valid ID.
> > >
> > > - When destroying a transport, ensure that ib_free_cq() is not
> > >   invoked on a CQ that was already released.
> > >
> > > Reported-by: Michal Kalderon <Michal.Kalderon@xxxxxxxxxx>
> > > Fixes: bebd031866ca ("xprtrdma: Support unplugging an HCA from ...")
> > > Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
> > > Cc: stable@xxxxxxxxxxxxxxx
> > > ---
> > >  net/sunrpc/xprtrdma/verbs.c | 13 +++++++++----
> > >  1 file changed, 9 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/net/sunrpc/xprtrdma/verbs.c
> > > b/net/sunrpc/xprtrdma/verbs.c
> > > index d19ea02..c284ee7 100644
> > > --- a/net/sunrpc/xprtrdma/verbs.c
> > > +++ b/net/sunrpc/xprtrdma/verbs.c
> > > @@ -251,7 +251,6 @@
> > >  	wait_for_completion(&ia->ri_remove_done);
> > >
> > >  	ia->ri_id = NULL;
> > > -	ia->ri_pd = NULL;
> > >  	ia->ri_device = NULL;
> > >  	/* Return 1 to ensure the core destroys the id. */
> > >  	return 1;
> > > @@ -449,7 +448,9 @@
> > >  		ia->ri_id->qp = NULL;
> > >  	}
> > >  	ib_free_cq(ep->rep_attr.recv_cq);
> > > +	ep->rep_attr.recv_cq = NULL;
> > >  	ib_free_cq(ep->rep_attr.send_cq);
> > > +	ep->rep_attr.send_cq = NULL;
> > >
> > >  	/* The ULP is responsible for ensuring all DMA
> > >  	 * mappings and MRs are gone.
> > > @@ -462,6 +463,8 @@
> > >  		rpcrdma_dma_unmap_regbuf(req->rl_recvbuf);
> > >  	}
> > >  	rpcrdma_mrs_destroy(buf);
> > > +	ib_dealloc_pd(ia->ri_pd);
> > > +	ia->ri_pd = NULL;
> > >
> > >  	/* Allow waiters to continue */
> > >  	complete(&ia->ri_remove_done);
> > > @@ -629,14 +632,16 @@
> > >  {
> > >  	cancel_delayed_work_sync(&ep->rep_connect_worker);
> > >
> > > -	if (ia->ri_id->qp) {
> > > +	if (ia->ri_id && ia->ri_id->qp) {
> > >  		rpcrdma_ep_disconnect(ep, ia);
> > >  		rdma_destroy_qp(ia->ri_id);
> > >  		ia->ri_id->qp = NULL;
> > >  	}
> > >
> > > -	ib_free_cq(ep->rep_attr.recv_cq);
> > > -	ib_free_cq(ep->rep_attr.send_cq);
> > > +	if (ep->rep_attr.recv_cq)
> > > +		ib_free_cq(ep->rep_attr.recv_cq);
> > > +	if (ep->rep_attr.send_cq)
> > > +		ib_free_cq(ep->rep_attr.send_cq);
> > >  }
> > >
> > >  /* Re-establish a connection after a device removal event.
> > >
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-rdma"
> > > in the body of a message to majordomo@xxxxxxxxxxxxxxx
> > > More majordomo info at http://vger.kernel.org/majordomo-info.html
> >
> > --
> > Chuck Lever
> >
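
For anyone following the thread, the release-then-clear pattern the quoted patch relies on can be sketched in plain userspace C. This is only an illustration under stated assumptions: every name below (fake_cq, fake_ep, fake_free_cq, fake_remove_device, fake_ep_destroy, fake_removal_then_destroy) is hypothetical and is not part of xprtrdma or the RDMA core, and malloc()/free() merely stand in for CQ allocation and release. The removal path releases each CQ and NULLs the pointer; the destroy path then skips whatever is already gone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these are the kernel's types. */
struct fake_cq {
	int unused;
};

struct fake_ep {
	struct fake_cq *recv_cq;
	struct fake_cq *send_cq;
	int frees;	/* counts releases, for illustration only */
};

/* Stand-in for a release routine that must never see a stale or NULL CQ. */
static void fake_free_cq(struct fake_ep *ep, struct fake_cq *cq)
{
	assert(cq != NULL);
	free(cq);
	ep->frees++;
}

/* Removal path: release each CQ, then clear the pointer right away. */
static void fake_remove_device(struct fake_ep *ep)
{
	fake_free_cq(ep, ep->recv_cq);
	ep->recv_cq = NULL;
	fake_free_cq(ep, ep->send_cq);
	ep->send_cq = NULL;
}

/* Destroy path: skip anything the removal path already released. */
static void fake_ep_destroy(struct fake_ep *ep)
{
	if (ep->recv_cq) {
		fake_free_cq(ep, ep->recv_cq);
		ep->recv_cq = NULL;
	}
	if (ep->send_cq) {
		fake_free_cq(ep, ep->send_cq);
		ep->send_cq = NULL;
	}
}

/* Runs removal followed by destroy; returns the number of releases,
 * or -1 if the stand-in allocations fail.
 */
static int fake_removal_then_destroy(void)
{
	struct fake_ep ep = { NULL, NULL, 0 };

	ep.recv_cq = malloc(sizeof(*ep.recv_cq));
	ep.send_cq = malloc(sizeof(*ep.send_cq));
	if (!ep.recv_cq || !ep.send_cq) {
		free(ep.recv_cq);
		free(ep.send_cq);
		return -1;
	}

	fake_remove_device(&ep);
	fake_ep_destroy(&ep);	/* both pointers are NULL, so nothing is freed twice */
	return ep.frees;
}
```

Calling fake_removal_then_destroy() from a small main() exercises both paths in the order seen during a device unload. Without the NULL assignments in the removal path, the destroy path would hand already-freed pointers to the release routine a second time.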