RE: [for-next 1/2] xprtrdma: take reference of rdma provider module

> -----Original Message-----
> From: Shirley Ma [mailto:shirley.ma@xxxxxxxxxx]
> Sent: Thursday, July 17, 2014 1:58 PM
> To: Hefty, Sean; Steve Wise; Devesh Sharma; Roland Dreier
> Cc: linux-rdma@xxxxxxxxxxxxxxx; chuck.lever@xxxxxxxxxx
> Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma provider module
> 
> 
> 
> On 07/17/2014 09:06 AM, Hefty, Sean wrote:
> >> On 7/17/2014 9:01 AM, Devesh Sharma wrote:
> >>> If removal of a vendor driver is attempted while xprtrdma still has an
> >>> active mount, the removal may never complete and can cause
> >>> unseen races or, in the worst case, a system crash.
> >>>
> >>> To solve this, the xprtrdma module should take a reference on the struct
> >>> ib_device for every mount. The reference is taken after local device
> >>> address resolution has completed successfully.
> >>>
> >>> The reference to the struct ib_device pointer is put just before cm_id
> >>> destruction.
> >>>
> >>> Signed-off-by: Devesh Sharma <devesh.sharma@xxxxxxxxxx>
> >>
> >> This seems like an issue with the rdma-cm or rdma core, not xprtrdma.  I
> >> see that user rdma applications cause a ref on the provider module here
> >> in ib_uverbs_open():
> >>
> >>          if (!try_module_get(dev->ib_dev->owner)) {
> >>                  ret = -ENODEV;
> >>                  goto err;
> >>          }
> >>
> >> Maybe kernel applications that allocate device resources should cause a
> >> ref on the provider's module.
> >>
> >> Sean/Roland,  is there some history here as to how rdma provider module
> >> removal should be handled?
> >
> > Kernel modules are not expected to access the rdma devices after their
> > remove device callback has been invoked.  The rdma cm basically forwards
> > the device removal on a per id basis.  Apps are expected to destroy the id
> > after receiving that callback.  The rdma cm should block in the remove
> > device call until all ids associated with the removed device have been
> > destroyed.
> 
> So the rdma cm is expected to increase the driver reference count (try_module_get) for
> each new cm id, then decrease the reference count (module_put) when the cm id is destroyed?
> 

No, I think he's saying the rdma-cm posts an RDMA_CM_EVENT_DEVICE_REMOVAL event to each
application with rdmacm objects allocated, and each application is expected to destroy all
the objects it has allocated before returning from the event handler.
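
Roughly, I picture the flow like the userspace toy model below.  It is only a
sketch to illustrate the hand-off, not kernel code: every sim_* name is made up
for this example, and the real rdma-cm would block until the count of live ids
reaches zero rather than return it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; none of these are kernel APIs. */
enum sim_event { SIM_EVENT_DEVICE_REMOVAL };

struct sim_cm_id {
	int has_qp;     /* consumer-owned resource tied to the device */
	int destroyed;  /* set once the consumer has torn the id down */
	int (*handler)(struct sim_cm_id *id, enum sim_event ev);
};

/* Consumer side: release everything before returning from the handler. */
int sim_consumer_handler(struct sim_cm_id *id, enum sim_event ev)
{
	if (ev == SIM_EVENT_DEVICE_REMOVAL) {
		id->has_qp = 0;     /* free device resources first */
		id->destroyed = 1;  /* then give up the id itself */
	}
	return 0;
}

/*
 * CM side: post the removal event to every live id bound to the device.
 * Returns how many ids are still alive afterwards; this toy model reports
 * the count instead of blocking on it.
 */
size_t sim_cm_remove_device(struct sim_cm_id *ids, size_t n)
{
	size_t alive = 0;

	for (size_t i = 0; i < n; i++) {
		if (!ids[i].destroyed)
			ids[i].handler(&ids[i], SIM_EVENT_DEVICE_REMOVAL);
		if (!ids[i].destroyed)
			alive++;
	}
	return alive;
}
```

Calling sim_cm_remove_device() on an array of ids whose handler is
sim_consumer_handler should return 0, i.e. nothing is left referencing the
device when the removal call finishes.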

And I think the ib_verbs core calls each ib_client's remove handler when an rdma provider
unregisters with the core.  
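
As a rough illustration of that second point, here is another userspace toy
model.  Again, the sim_* names, the fixed-size client table and the "users"
counter are all assumptions made up for the sketch; it only shows the shape of
"core walks its client list and calls each remove handler".

```c
#include <assert.h>
#include <stddef.h>

#define SIM_MAX_CLIENTS 4

/* Hypothetical device: "users" counts clients still holding resources. */
struct sim_device {
	const char *name;
	int users;
};

/* Hypothetical client with a remove callback, loosely modeled on the idea
 * of a per-client remove handler described above. */
struct sim_client {
	const char *name;
	void (*remove)(struct sim_device *dev);
};

static struct sim_client *sim_clients[SIM_MAX_CLIENTS];
static size_t sim_nclients;

/* Add a client to the core's list; returns -1 if the table is full. */
int sim_register_client(struct sim_client *c)
{
	if (sim_nclients == SIM_MAX_CLIENTS)
		return -1;
	sim_clients[sim_nclients++] = c;
	return 0;
}

/* Example remove handler: drop this client's use of the device. */
void sim_client_remove(struct sim_device *dev)
{
	if (dev->users > 0)
		dev->users--;
}

/* Core side: on provider unregister, call every client's remove handler.
 * Returns the number of users left, which should be 0 if all clients
 * cleaned up. */
int sim_unregister_device(struct sim_device *dev)
{
	for (size_t i = 0; i < sim_nclients; i++)
		sim_clients[i]->remove(dev);
	return dev->users;
}
```

With two registered clients and a device that starts with users == 2,
sim_unregister_device() should return 0 once both remove handlers have run.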

Steve.
