On Mon, Jul 22, 2019 at 02:21:18PM +0300, Gal Pressman wrote:
> On 22/07/2019 12:15, Leon Romanovsky wrote:
> > On Mon, Jul 22, 2019 at 11:10:51AM +0300, Gal Pressman wrote:
> >> Hi all,
> >>
> >> I'm seeing memory leaks when running tests with the valgrind memcheck tool [1].
> >> It seems to be caused by the verbs_device refcount never reaching zero.
> >>
> >> The last related commit is 8125fdeb69bb ("verbs: Avoid ibv_device memory leak"),
> >> which seems like it should prevent this issue - but I'm not sure it covers all
> >> cases.
> >>
> >> When calling ibv_get_device_list, try_driver will eventually get called and set
> >> the device refcount to one. The refcount for each device will be increased when
> >> iterating the devices list, and on each verbs_init_context call.
> >>
> >> In the free flow, the refcount is decreased on verbs_uninit_context and when
> >> iterating the devices list - which brings the refcount back to one, as initially
> >> set by try_driver (hence uninit_device isn't called).
> >>
> >> Is there a reason for initializing the refcount to one instead of zero? According
> >> to the cited commit, the reference count should be decreased when the device no
> >> longer exists in sysfs, but the device isn't necessarily removed from sysfs.
> >
> > Such a scheme allows us to avoid rdma-core provider reinitialization every
> > time an application "plays" with ibv_get_device_list(). Anyway, the rdma-core
> > library (libibverbs) won't be unloaded until dlclose() is called and the glibc
> > reference count reaches zero, so we don't need to release the provider
> > before that point either.
>
> So you consider these valgrind errors false alarms?

Yes, valgrind checks the code that was executed and is unlikely to check the
unload sequence. In your case, the unload code wasn't called at all.

Thanks
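
For reference, the refcount flow discussed above can be sketched roughly as
follows. This is only a toy model, not rdma-core code: toy_device,
toy_try_driver, toy_get, toy_put and toy_run_app_flow are hypothetical names
made up for illustration, and the real verbs_device handling is more involved.
The sketch assumes the initial reference is owned by the library itself and is
dropped only on the unload path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a verbs device; not the rdma-core struct. */
struct toy_device {
	int refcount;
	bool *released;	/* set to true when the device is finally freed */
};

/* Models the discovery step: the device starts with one reference,
 * assumed here to be held by the library for as long as it is loaded. */
struct toy_device *toy_try_driver(bool *released)
{
	struct toy_device *dev = malloc(sizeof(*dev));

	if (!dev)
		return NULL;
	dev->refcount = 1;
	dev->released = released;
	*released = false;
	return dev;
}

void toy_get(struct toy_device *dev)
{
	dev->refcount++;
}

/* Drops one reference and frees the device when the last one goes away. */
void toy_put(struct toy_device *dev)
{
	assert(dev->refcount > 0);
	if (--dev->refcount == 0) {
		*dev->released = true;
		free(dev);
	}
}

/* Models what an application does: get the device list, open a context,
 * close it, free the list. Returns the refcount left afterwards. */
int toy_run_app_flow(struct toy_device *dev)
{
	toy_get(dev);	/* device list iteration takes a reference */
	toy_get(dev);	/* context init takes a reference */
	toy_put(dev);	/* context uninit drops it */
	toy_put(dev);	/* freeing the device list drops it */
	return dev->refcount;
}
```

In this model the application flow always leaves the count at one, so the
device is never freed by the application's own calls. Only one more
toy_put(), standing in for the library unload path, releases it. A leak
checker that runs before that path executes would still see the allocation.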