> -----Original Message-----
> From: Chuck Lever [mailto:chuck.lever@xxxxxxxxxx]
> Sent: Monday, July 21, 2014 9:54 AM
> To: Devesh Sharma
> Cc: Shirley Ma; Steve Wise; Hefty, Sean; Roland Dreier; linux-rdma@xxxxxxxxxxxxxxx
> Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma provider module
>
> Hi Devesh-
>
> Thanks for drilling into this further.
>
> On Jul 21, 2014, at 7:48 AM, Devesh Sharma <Devesh.Sharma@xxxxxxxxxx> wrote:
>
> > In rpcrdma_ep_connect():
> >
> >     write_lock(&ia->ri_qplock);
> >     old = ia->ri_id;
> >     ia->ri_id = id;
> >     write_unlock(&ia->ri_qplock);
> >
> >     rdma_destroy_qp(old);
> >     rdma_destroy_id(old);  =============> cm_id is destroyed here.
> >
> > If the following code fails in rpcrdma_ep_connect():
> >
> >     id = rpcrdma_create_id(xprt, ia,
> >                            (struct sockaddr *)&xprt->rx_data.addr);
> >     if (IS_ERR(id)) {
> >         rc = -EHOSTUNREACH;
> >         goto out;
> >     }
> >
> > it leaves the old cm_id still alive. This will always fail if the device
> > is removed abruptly.
>
> For CM_EVENT_DEVICE_REMOVAL, rpcrdma_conn_upcall() sets ep->rep_connected
> to -ENODEV.
>
> Then:
>
> 929 int
> 930 rpcrdma_ep_connect(struct rpcrdma_ep *ep, struct rpcrdma_ia *ia)
> 931 {
> 932         struct rdma_cm_id *id, *old;
> 933         int rc = 0;
> 934         int retry_count = 0;
> 935
> 936         if (ep->rep_connected != 0) {
> 937                 struct rpcrdma_xprt *xprt;
> 938 retry:
> 939                 dprintk("RPC:       %s: reconnecting...\n", __func__);
>
> ep->rep_connected is probably -ENODEV after a device removal. It would be
> possible for the connect worker to destroy everything associated with this
> connection in that case, ensuring the underlying object reference counts
> are cleared.
>
> The immediate danger is that if there are pending RPCs, they could exit
> while the qp/cm_id are NULL, triggering a panic in
> rpcrdma_deregister_frmr_external(). Checking for NULL pointers inside the
> ri_qplock would prevent that.
>
> However, NFS mounts via this adapter will hang indefinitely after all
> transports are torn down and the adapter is gone. The only thing that can
> be done is something drastic like "echo b > /proc/sysrq-trigger" on the
> client.
>
> Thus, IMO hot-plugging or passive fail-over are the only scenarios where
> this makes sense. If we have an immediate problem here, is it a problem
> with system shutdown ordering that can be addressed in some other way?
>
> Until that support is in place, obviously I would prefer that removal of
> the underlying driver be prevented while there are NFS mounts in place. I
> think that's what NFS users have come to expect.
>
> In other words, don't allow device removal until we have support for
> device insertion :-)
>
> If we fix the above problems on provider unload, shouldn't the mount
> recover if the provider module is subsequently loaded? Or if another
> provider is configured such that rdma_resolve_addr()/rdma_resolve_route()
> then picks an active device?