On Mon, Jul 22, 2019 at 11:10:51AM +0300, Gal Pressman wrote:
> Hi all,
>
> I'm seeing memory leaks when running tests with valgrind memcheck tool [1]. It
> seems like it's caused due to verbs_device refcount never reaching zero.
>
> Last related commit is 8125fdeb69bb ("verbs: Avoid ibv_device memory leak"),
> which seems like it should prevent this issue - but I'm not sure it covers all
> cases.
>
> When calling ibv_get_device_list, try_driver will eventually get called and set
> the device refcount to one. The refcount for each device will be increased when
> iterating the devices list, and on each verbs_init_context call.
>
> In the free flow, the refcount is decreased on verbs_uninit_context and when
> iterating the devices list - which brings the refcount back to one, as initially
> set by try_driver (hence uninit_device isn't called).
>
> Is there a reason for initializing refcount to one instead of zero? According to
> the cited commit the reference count should be decreased when the device no
> longer exists in the sysfs, but the device isn't necessarily removed from the
> sysfs.

Such a scheme allows us to avoid reinitializing the rdma-core provider every
time an application "plays" with ibv_get_device_list().

In any case, the rdma-core library (libibverbs) won't be unloaded until
dlclose() is called and the glibc reference count reaches zero, so we don't
need to release the provider before that point either.

Thanks
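
For illustration, the reference-count flow discussed above can be modeled with a small toy program. This is only a sketch and not libibverbs code: struct toy_device and the toy_* helpers are hypothetical names invented here, and the model assumes a single device and a single context. It shows why the count returns to one, and not zero, after the list and the context are both released.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a verbs device; NOT the real struct verbs_device. */
struct toy_device {
    int refcount;        /* number of outstanding references */
    bool uninit_called;  /* set when the provider would be released */
};

/* Take one reference. */
static void toy_get(struct toy_device *d)
{
    d->refcount++;
}

/* Drop one reference; release the provider only when the count hits zero. */
static void toy_put(struct toy_device *d)
{
    assert(d->refcount > 0);
    if (--d->refcount == 0)
        d->uninit_called = true;
}

/* Discovery step: the library keeps one long-lived reference of its own. */
static void toy_try_driver(struct toy_device *d)
{
    d->refcount = 1;
    d->uninit_called = false;
}

/* One get-list / open / close / free-list cycle; returns the final count. */
int toy_demo_cycle(struct toy_device *d)
{
    toy_try_driver(d); /* refcount == 1 */
    toy_get(d);        /* device list built:   2 */
    toy_get(d);        /* context initialized: 3 */
    toy_put(d);        /* context torn down:   2 */
    toy_put(d);        /* device list freed:   1 */
    return d->refcount;
}
```

In this model the remaining reference is the one taken at discovery time, so the provider stays initialized across repeated list/free cycles.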