On Wed, Oct 02, 2019 at 12:03:14AM +0200, Stefan Metzmacher wrote:
> >>> Globally blocking module unload would break the existing dis-associate
> >>> flows, and blocking until listeners are removed seems like all rdma
> >>> drivers will instantly become permanently blocked when things like SRP
> >>> or IPoIB CM mode are running?
> >>
> >> So the design is to allow drivers to be unloaded while there are
> >> active connections?
> >>
> >> If so is this specific to RDMA drivers?
> >
> > No, it is normal for networking, you can ip link set down and unload a
> > net driver even though there are sockets open that might traverse it
>
> Ok.
>
> >>> I think the proper thing is to fix rxe (and probably siw) to signal
> >>> the DEVICE_FATAL so the CMA listeners can cleanly disconnect
> >>
> >> I just found that drivers/nvme/host/rdma.c and
> >> drivers/nvme/target/rdma.c both use ib_register_client()
> >> in order to get notified that a device is going to be removed.
> >>
> >> Maybe I should also use ib_register_client()?
> >
> > Oh, yes, all kernel clients must use register_client and related to
> > manage their connection to the RDMA stack, otherwise they are probably
> > racy. The remove callback there is the same idea as the device_fatal
> > scheme is for userspace.
>
> Ok, thanks! I'll take a look at it.
>
> > How do you discover the RDMA device to use? Just call into CM and let
> > it sort it out? That actually seems reasonable, but then CM should
> > take care of the remove() to kill connections, I suppose it doesn't..
>
> On the client:
>
> rdma_create_id()
> rdma_resolve_addr()
> rdma_resolve_route()
> rdma_connect()
>
> On the server:
>
> rdma_create_id()
> rdma_bind_addr()
> rdma_listen()
> rdma_accept()
>
> I just pass in an IPv4 or IPv6 address.

Ah, the only working flow today is to destroy your IDs when a remove
comes, and CM IDs become linked to a single rdma device so you know
which IDs to tear down on destroy

Jason
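
[Editor's note: a minimal sketch of the ib_register_client() pattern
discussed above, as used by drivers/nvme/host/rdma.c. The smbd_* names
are hypothetical placeholders, and the exact callback signatures vary
across kernel versions; this is an illustration, not code from the
thread.]

```c
#include <rdma/ib_verbs.h>

/* Called once per RDMA device, for devices already registered at
 * ib_register_client() time and for any hot-added later. */
static void smbd_add_one(struct ib_device *device)
{
	/* Record the device, optionally start listening on it. */
}

/* Called before the device is unregistered (e.g. driver unload).
 * The client must tear down every rdma_cm ID (and therefore every
 * connection) bound to this device before returning. */
static void smbd_remove_one(struct ib_device *device, void *client_data)
{
	/* Destroy all CM IDs linked to 'device'. */
}

static struct ib_client smbd_ib_client = {
	.name   = "smbdirect",	/* hypothetical client name */
	.add    = smbd_add_one,
	.remove = smbd_remove_one,
};

/* In module init / exit: */
/*   ib_register_client(&smbd_ib_client);   */
/*   ib_unregister_client(&smbd_ib_client); */
```

This matches Jason's point: because each rdma_cm ID is bound to a single
device after address resolution, the remove callback can identify
exactly which IDs to destroy.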