Hi Devesh-

Thanks for drilling into this further.

On Jul 21, 2014, at 7:48 AM, Devesh Sharma <Devesh.Sharma@xxxxxxxxxx> wrote:

> In rpcrdma_ep_connect():
>
> 	write_lock(&ia->ri_qplock);
> 	old = ia->ri_id;
> 	ia->ri_id = id;
> 	write_unlock(&ia->ri_qplock);
>
> 	rdma_destroy_qp(old);
> 	rdma_destroy_id(old); =============> CM ID is destroyed here.
>
> If the following code fails in rpcrdma_ep_connect():
>
> 	id = rpcrdma_create_id(xprt, ia,
> 			(struct sockaddr *)&xprt->rx_data.addr);
> 	if (IS_ERR(id)) {
> 		rc = -EHOSTUNREACH;
> 		goto out;
> 	}
>
> it leaves the old CM ID still alive.

This will always fail if the device is removed abruptly. For CM_EVENT_DEVICE_REMOVAL, rpcrdma_conn_upcall() sets ep->rep_connected to -ENODEV. Then:

 929 int
 930 rpcrdma_ep_connect(struct rpcrdma_ep *ep, struct rpcrdma_ia *ia)
 931 {
 932 	struct rdma_cm_id *id, *old;
 933 	int rc = 0;
 934 	int retry_count = 0;
 935
 936 	if (ep->rep_connected != 0) {
 937 		struct rpcrdma_xprt *xprt;
 938 retry:
 939 		dprintk("RPC:       %s: reconnecting...\n", __func__);

ep->rep_connected is probably -ENODEV after a device removal. In that case the connect worker could destroy everything associated with this connection, to ensure the underlying object reference counts are cleared.

The immediate danger is that if there are pending RPCs, they could exit while the QP and CM ID are NULL, triggering a panic in rpcrdma_deregister_frmr_external(). Checking for NULL pointers inside the ri_qplock would prevent that.

However, NFS mounts via this adapter will hang indefinitely after all transports are torn down and the adapter is gone. The only thing that can be done then is something drastic like "echo b > /proc/sysrq-trigger" on the client. Thus, IMO, hot-plugging or passive fail-over are the only scenarios where this makes sense.

If we have an immediate problem here, is it a problem with system shutdown ordering that can be addressed in some other way?
Until that support is in place, obviously I would prefer that removal of the underlying driver be prevented while there are NFS mounts in place. I think that's what NFS users have come to expect. In other words, don't allow device removal until we have support for device insertion :-)

> In rdma_resolve_addr()/rdma_destroy_id(), cma_dev is referenced/dereferenced here (cma.c):
>
> static int cma_acquire_dev(struct rdma_id_private *id_priv,
> 			   struct rdma_id_private *listen_id_priv)
> {
> 	.
> 	.
> 	if (!ret)
> 		cma_attach_to_dev(id_priv, cma_dev);
> }
>
> static void cma_release_dev(struct rdma_id_private *id_priv)
> {
> 	mutex_lock(&lock);
> 	list_del(&id_priv->list);
> 	cma_deref_dev(id_priv->cma_dev);
> 	.
> 	.
> }
>
> Since, by design, nfs-rdma always keeps at least the previously known good CM ID alive until
> another good CM ID is created, cma_dev->refcount never reaches 0 upon device removal,
> thus blocking "rmmod <vendor driver>" forever.
>
> -Regards
> Devesh
>
>> -----Original Message-----
>> From: linux-rdma-owner@xxxxxxxxxxxxxxx [mailto:linux-rdma-
>> owner@xxxxxxxxxxxxxxx] On Behalf Of Devesh Sharma
>> Sent: Monday, July 21, 2014 11:42 AM
>> To: Shirley Ma; Steve Wise; 'Chuck Lever'
>> Cc: 'Hefty, Sean'; 'Roland Dreier'; linux-rdma@xxxxxxxxxxxxxxx
>> Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma provider
>> module
>>
>> Shirley,
>>
>> Once rmmod is issued, the connection corresponding to the active mount is
>> destroyed and all the associated resources are freed. As part of the processing
>> logic for the DEVICE_REMOVAL event, nfs-rdma wakes up all the waiters. This
>> results in re-establishment efforts; since the device is not present any
>> more, rdma_resolve_addr() fails with a CM address resolution error. This loop
>> continues forever.
>>
>> I am yet to find out which part of ocrdma is blocked. I am putting in some debug
>> messages to find out, and will get back to the group with an update.
>>
>> -Regards
>> Devesh
>>
>>> -----Original Message-----
>>> From: Shirley Ma [mailto:shirley.ma@xxxxxxxxxx]
>>> Sent: Friday, July 18, 2014 9:18 PM
>>> To: Steve Wise; Devesh Sharma; 'Chuck Lever'
>>> Cc: 'Hefty, Sean'; 'Roland Dreier'; linux-rdma@xxxxxxxxxxxxxxx
>>> Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma provider
>>> module
>>>
>>> On 07/18/2014 06:27 AM, Steve Wise wrote:
>>>>>>> We can't really deal with a CM_DEVICE_REMOVE event while there
>>>>>>> are active NFS mounts.
>>>>>>>
>>>>>>> System shutdown ordering should guarantee (one would hope) that NFS
>>>>>>> mount points are unmounted before the RDMA/IB core infrastructure
>>>>>>> is torn down. Ordering shouldn't matter as long as all NFS activity
>>>>>>> has ceased before the CM tries to remove the device.
>>>>>>>
>>>>>>> So if something is hanging up the CM, there's something xprtrdma
>>>>>>> is not cleaning up properly.
>>>>>>
>>>>>> Devesh, how are you reproducing this? Are you just rmmod'ing the
>>>>>> ocrdma module while there are active mounts?
>>>>>
>>>>> Yes, I am issuing rmmod while there is an active mount. In my case,
>>>>> rmmod ocrdma remains blocked forever.
>>>
>>> Where is it blocked?
>>>
>>>>> Off the course of this discussion: Is there a reason behind not
>>>>> using the ib_register_client()/ib_unregister_client() framework?
>>>>
>>>> I think the idea is that you don't need to use it if you are
>>>> transport-independent and use the rdmacm...
>>>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-rdma"
>>>> in the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com