On Wed, Oct 02, 2019 at 12:03:14AM +0200, Stefan Metzmacher wrote:
> >>> Globally blocking module unload would break the existing dis-associate
> >>> flows, and blocking until listeners are removed seems like all rdma
> >>> drivers will instantly become permanently blocked when things like SRP
> >>> or IPoIB CM mode are running?
> >>
> >> So the design is to allow drivers to be unloaded while there are
> >> active connections?
> >>
> >> If so is this specific to RDMA drivers?
> >
> > No, it is normal for networking, you can ip link set down and unload a
> > net driver even though there are sockets open that might traverse it
>
> Ok.
>
> >>> I think the proper thing is to fix rxe (and probably siw) to signal
> >>> the DEVICE_FATAL so the CMA listeners can cleanly disconnect
> >>
> >> I just found that drivers/nvme/host/rdma.c and
> >> drivers/nvme/target/rdma.c both use ib_register_client()
> >> in order to get notified that a device is going to be removed.
> >>
> >> Maybe I should also use ib_register_client()?
> >
> > Oh, yes, all kernel clients must use register_client and related to
> > manage their connection to the RDMA stack, otherwise they are probably
> > racy. The remove callback there is the same idea as the device_fatal
> > scheme is for userspace.
>
> Ok, thanks! I'll take a look at it.
>
> > How do you discover the RDMA device to use? Just call into CM and let
> > it sort it out? That actually seems reasonable, but then CM should
> > take care of the remove() to kill connections, I suppose it doesn't..
>
> On the client:
>
> rdma_create_id()
> rdma_resolve_addr()
> rdma_resolve_route()
> rdma_connect()
>
> On the server:
>
> rdma_create_id()
> rdma_bind_addr()
> rdma_listen()
> rdma_accept()
>
> I just pass in an IPv4 or IPv6 address.

Ah, the only working flow today is to destroy your IDs when a remove
comes, and CM IDs become linked to a single rdma device so you know
which IDs to tear down on destroy

Jason
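
[Editor's note: a minimal sketch of the ib_register_client() pattern
discussed above, as used by drivers/nvme/host/rdma.c. The smbd_* names
are hypothetical placeholders, and the exact callback signatures vary
across kernel versions; this is an illustration, not code from the
thread.]

```c
#include <rdma/ib_verbs.h>

/* Called once per RDMA device, for devices already registered at
 * ib_register_client() time and for any hot-added later. */
static void smbd_add_one(struct ib_device *device)
{
	/* Record the device, optionally start listening on it. */
}

/* Called before the device is unregistered (e.g. driver unload).
 * The client must tear down every rdma_cm ID (and therefore every
 * connection) bound to this device before returning. */
static void smbd_remove_one(struct ib_device *device, void *client_data)
{
	/* Destroy all CM IDs linked to 'device'. */
}

static struct ib_client smbd_ib_client = {
	.name   = "smbdirect",	/* hypothetical client name */
	.add    = smbd_add_one,
	.remove = smbd_remove_one,
};

/* In module init / exit: */
/*   ib_register_client(&smbd_ib_client);   */
/*   ib_unregister_client(&smbd_ib_client); */
```

This matches Jason's point: because each rdma_cm ID is bound to a single
device after address resolution, the remove callback can identify
exactly which IDs to destroy.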