Re: [PATCH] rdma/core: add __module_get()/module_put() to cma_[de]ref_dev()

Stefan Metzmacher <metze@xxxxxxxxx> · Wed, 2 Oct 2019 00:03:14 +0200

>>> Globally blocking module unload would break the existing dis-associate
>>> flows, and blocking until listeners are removed seems like all rdma
>>> drivers will instantly become permanently blocked when things like SRP
>>> or IPoIB CM mode are running?
>>
>> So the design is to allow drivers to be unloaded while there are
>> active connections?
>>
>> If so is this specific to RDMA drivers?
> 
> No, it is normal for networking, you can ip link set down and unload a
> net driver even though there are sockets open that might traverse it

Ok.

>>> I think the proper thing is to fix rxe (and probably siw) to signal
>>> the DEVICE_FATAL so the CMA listeners can cleanly disconnect
>>
>> I just found that drivers/nvme/host/rdma.c and
>> drivers/nvme/target/rdma.c both use ib_register_client();
>> in order to get notified that a device is going to be removed.
>>
>> Maybe I should also use ib_register_client()?
> 
> Oh, yes, all kernel clients must use register_client and related to
> manage their connection to the RDMA stack otherwise they are probably
> racy. The remove callback there is the same idea as the device_fatal
> scheme for userspace.

Ok, thanks! I'll take a look at it.

> How do you discover the RDMA device to use? Just call into CM and let
> it sort it out? That actually seems reasonable, but then CM should
> take care of the remove() to kill connections, I suppose it doesn't..

On the client:

rdma_create_id()
rdma_resolve_addr()
rdma_resolve_route()
rdma_connect()

On the server:

rdma_create_id()
rdma_bind_addr()
rdma_listen()
rdma_accept()

I just pass in an IPv4 or IPv6 address.
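
As a rough sketch of how I understand the event-driven flow behind those
calls (a userspace model only, not the actual code): each step completes
via a CM event, and the handler picks the next call. The cm_event and
cm_action enums and both functions below are hypothetical stand-ins; in
the kernel the events would be the RDMA_CM_EVENT_* values from
<rdma/rdma_cm.h>, and I am assuming that a device removal should be
treated like a disconnect.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for RDMA_CM_EVENT_* (hypothetical names). */
enum cm_event {
	EV_ADDR_RESOLVED,	/* rdma_resolve_addr() completed */
	EV_ROUTE_RESOLVED,	/* rdma_resolve_route() completed */
	EV_CONNECT_REQUEST,	/* listener received a connection request */
	EV_ESTABLISHED,		/* rdma_connect()/rdma_accept() completed */
	EV_DISCONNECTED,
	EV_DEVICE_REMOVAL,	/* underlying device is going away */
};

/* What the event handler does next (hypothetical names). */
enum cm_action {
	ACT_RESOLVE_ROUTE,	/* client: call rdma_resolve_route() */
	ACT_CONNECT,		/* client: call rdma_connect() */
	ACT_ACCEPT,		/* server: call rdma_accept() */
	ACT_START_IO,		/* connection is usable */
	ACT_TEARDOWN,		/* drop the connection and free resources */
};

/* Map a CM event to the next step of the connection setup. */
enum cm_action cm_next_action(enum cm_event ev)
{
	switch (ev) {
	case EV_ADDR_RESOLVED:
		return ACT_RESOLVE_ROUTE;
	case EV_ROUTE_RESOLVED:
		return ACT_CONNECT;
	case EV_CONNECT_REQUEST:
		return ACT_ACCEPT;
	case EV_ESTABLISHED:
		return ACT_START_IO;
	case EV_DISCONNECTED:
	case EV_DEVICE_REMOVAL:
		/* Assumption: removal is handled like a disconnect. */
		return ACT_TEARDOWN;
	}
	/* Unknown events are treated as fatal in this sketch. */
	return ACT_TEARDOWN;
}

/* Human-readable name of the call (or step) behind an action. */
const char *cm_action_name(enum cm_action act)
{
	static const char *const names[] = {
		[ACT_RESOLVE_ROUTE] = "rdma_resolve_route",
		[ACT_CONNECT]       = "rdma_connect",
		[ACT_ACCEPT]        = "rdma_accept",
		[ACT_START_IO]      = "start_io",
		[ACT_TEARDOWN]      = "teardown",
	};

	assert((size_t)act < sizeof(names) / sizeof(names[0]));
	return names[act];
}
```

The point of the model is only that EV_DEVICE_REMOVAL ends up in the
same teardown path as a normal disconnect, so the handler needs no
separate state for it.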

metze
