RE: [for-next 1/2] xprtrdma: take reference of rdma provider module

Devesh Sharma <Devesh.Sharma@xxxxxxxxxx> · Mon, 21 Jul 2014 05:40:56 +0000

> -----Original Message-----
> From: Chuck Lever [mailto:chuck.lever@xxxxxxxxxx]
> Sent: Friday, July 18, 2014 8:57 PM
> To: Devesh Sharma
> Cc: Steve Wise; Hefty, Sean; Shirley Ma; Roland Dreier;
> linux-rdma@xxxxxxxxxxxxxxx
> Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma provider
> module
> 
> 
> On Jul 18, 2014, at 2:19 AM, Devesh Sharma <Devesh.Sharma@xxxxxxxxxx>
> wrote:
> 
> >> -----Original Message-----
> >> From: linux-rdma-owner@xxxxxxxxxxxxxxx
> >> [mailto:linux-rdma-owner@xxxxxxxxxxxxxxx] On Behalf Of Steve Wise
> >> Sent: Friday, July 18, 2014 1:39 AM
> >> To: 'Hefty, Sean'; 'Shirley Ma'; Devesh Sharma; 'Roland Dreier'
> >> Cc: linux-rdma@xxxxxxxxxxxxxxx; chuck.lever@xxxxxxxxxx
> >> Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma provider
> >> module
> >>
> >>
> >>
> >>> -----Original Message-----
> >>> From: Steve Wise [mailto:swise@xxxxxxxxxxxxxxxxxxxxx]
> >>> Sent: Thursday, July 17, 2014 2:56 PM
> >>> To: 'Hefty, Sean'; 'Shirley Ma'; 'Devesh Sharma'; 'Roland Dreier'
> >>> Cc: 'linux-rdma@xxxxxxxxxxxxxxx'; 'chuck.lever@xxxxxxxxxx'
> >>> Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma
> >>> provider module
> >>>
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: Hefty, Sean [mailto:sean.hefty@xxxxxxxxx]
> >>>> Sent: Thursday, July 17, 2014 2:50 PM
> >>>> To: Steve Wise; 'Shirley Ma'; 'Devesh Sharma'; 'Roland Dreier'
> >>>> Cc: linux-rdma@xxxxxxxxxxxxxxx; chuck.lever@xxxxxxxxxx
> >>>> Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma
> >>>> provider module
> >>>>
> >>>>>> So the rdma cm is expected to increase the driver reference count
> >>>>>> (try_module_get) for each new cm id, then decrement it (module_put)
> >>>>>> when the cm id is destroyed?
> >>>>>>
> >>>>>
> >>>>> No, I think he's saying the rdma-cm posts a RDMA_CM_DEVICE_REMOVAL
> >>>>> event to each application with rdmacm objects allocated, and each
> >>>>> application is expected to destroy all the objects it has
> >>>>> allocated before returning from the event handler.
> >>>>
> >>>> This is almost correct.  The applications do not have to destroy
> >>>> all the objects that it has allocated before returning from their
> >>>> event handler.  E.g. an app can queue a work item that does the
> >>>> destruction.  The rdmacm will block in its ib_client remove handler
> >>>> until all relevant rdma_cm_id's have been destroyed.
> >>>>
> >>>
> >>> Thanks for the clarification.
> >>>
> >>
> >> And looking at xprtrdma, it does handle the DEVICE_REMOVAL event in
> >> rpcrdma_conn_upcall().
> >> It sets ep->rep_connected to -ENODEV, wakes everybody up, and calls
> >> rpcrdma_conn_func() for that endpoint, which schedules
> >> rep_connect_worker...  and I gave up following the code path at this
> >> point... :)
> >>
> >> For this to all work correctly, it would need to destroy all the QPs,
> >> MRs, CQs, etc for that device _before_ destroying the rdma cm ids.
> >> Otherwise the provider module could be unloaded too soon...
> >
> > Okay. Should I try to handle device removal in this proposed fashion and
> > post the v1?
> 
> Hi Devesh,
> 
> To make it work, xprtrdma is going to have to allow the device to be removed
> and added back while there are active NFS mounts and pending RPCs.
> AFAICT the code is not structured to do that today.
> 
> Probably the place to start is to see how much work is needed to leverage
> the existing logic to watch for ENODEV and do the right things to suspend
> RPC activity until another device is inserted. It would have to work like a
> network partition that causes a transport reconnect.
> 
> However, replacing everything, including all MRs and the PD, will require
> significant code churn and additional (undesirable) serialization around the
> use of QPs and cm_ids. Thus I would like to understand how much of a
> priority this is.

Sure, I take your point: for long-term stability it is right that we understand
the problem and do the right thing. However, if it's not feasible to change the
entire infrastructure in one go or in the near future, do we have other
ideas/options for handling this problem?

In real-world situations it's quite possible that someone attempts to
reload/replace/unload the vendor driver without knowing whether there is an
active mount.
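
As a rough illustration of the removal flow discussed in this thread, here is
a minimal single-threaded userspace model in C. It only mirrors the ordering
described above: the event handler marks the endpoint -ENODEV and defers the
teardown, the deferred work destroys the QP, MR and CQ before the cm_id, and
the remove path waits until no cm_ids remain. Every model_* name is
hypothetical; this is not the xprtrdma or rdma-cm code, and real code would
use a workqueue and block instead of looping.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MODEL_MAX_EPS 8   /* arbitrary limit for this sketch */

/* Hypothetical stand-in for one RDMA device and its provider module. */
struct model_device {
    int live_cm_ids;      /* cm_ids still bound to the device */
    int live_resources;   /* QPs, MRs and CQs still allocated on it */
    bool unloaded;        /* set once the provider could be unloaded */
};

/* Hypothetical stand-in for one transport endpoint. */
struct model_endpoint {
    struct model_device *dev;
    int connected;        /* 1 while up, -ENODEV after the removal event */
    bool has_qp, has_mr, has_cq, has_cm_id;
    bool teardown_queued; /* models a queued work item */
};

static void model_endpoint_init(struct model_endpoint *ep,
                                struct model_device *dev)
{
    ep->dev = dev;
    ep->connected = 1;
    ep->has_qp = ep->has_mr = ep->has_cq = ep->has_cm_id = true;
    ep->teardown_queued = false;
    dev->live_resources += 3;
    dev->live_cm_ids += 1;
}

/* Removal event handler: mark the endpoint and defer the destruction. */
static void model_on_device_removal(struct model_endpoint *ep)
{
    ep->connected = -ENODEV;
    ep->teardown_queued = true;
}

static void model_release(bool *held, int *counter)
{
    if (*held) {
        *held = false;
        (*counter)--;
    }
}

/* Deferred work: verbs resources go first, the cm_id goes last. */
static void model_teardown_work(struct model_endpoint *ep)
{
    struct model_device *dev = ep->dev;

    model_release(&ep->has_qp, &dev->live_resources);
    model_release(&ep->has_mr, &dev->live_resources);
    model_release(&ep->has_cq, &dev->live_resources);
    model_release(&ep->has_cm_id, &dev->live_cm_ids);
    ep->teardown_queued = false;
}

/* Remove path: post the event, then wait until every cm_id is gone.
 * Returns the number of resources still allocated at "unload" time. */
static int model_remove_device(struct model_device *dev,
                               struct model_endpoint *eps, int n)
{
    int i;

    for (i = 0; i < n; i++)
        model_on_device_removal(&eps[i]);

    /* Stands in for blocking: drain queued work until no cm_id is left. */
    while (dev->live_cm_ids > 0)
        for (i = 0; i < n; i++)
            if (eps[i].teardown_queued)
                model_teardown_work(&eps[i]);

    dev->unloaded = true;
    return dev->live_resources;
}

/* Build n endpoints on one device, remove the device, report leaks. */
static int model_run(int n)
{
    struct model_device dev = { 0, 0, false };
    struct model_endpoint eps[MODEL_MAX_EPS];
    int i;

    assert(n >= 0 && n <= MODEL_MAX_EPS);
    for (i = 0; i < n; i++)
        model_endpoint_init(&eps[i], &dev);
    return model_remove_device(&dev, eps, n);
}
```

In this model, model_run(3) returns 0 because each endpoint releases its QP,
MR and CQ before its cm_id, so nothing is left on the device once the last
cm_id is destroyed. If the cm_id were released first, the wait loop could end
while resources were still allocated, which is the early-unload problem Steve
describes.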

> 
> --
> Chuck Lever
> chuck[dot]lever[at]oracle[dot]com
> 
> 
