On Wed, Sep 04, 2024 at 11:31:13AM -0300, Jason Gunthorpe wrote:
> On Mon, Sep 02, 2024 at 04:42:52PM +0300, Leon Romanovsky wrote:
> > From: Leon Romanovsky <leonro@xxxxxxxxxx>
> >
> > Failure in driver initialization can lead to a situation where the GID
> > entries are set but not used yet. In this case, the kref will be equal to 1,
> > which will trigger a false positive leak detection.
>
> Why does that happen??
>
> > For example, these messages are printed during the driver initialization
> > and followed by release_gid_table() call:
> >
> >  infiniband syz1: ib_query_port failed (-19)
> >  infiniband syz1: Couldn't set up InfiniBand P_Key/GID cache
>
> Okay, but who set the ref=1?
>
> > diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
> > index b7c078b7f7cf..c6aec2e04d4c 100644
> > --- a/drivers/infiniband/core/cache.c
> > +++ b/drivers/infiniband/core/cache.c
> > @@ -800,13 +800,15 @@ static void release_gid_table(struct ib_device *device,
> >  		return;
> >
> >  	for (i = 0; i < table->sz; i++) {
> > +		int gid_kref;
> > +
> >  		if (is_gid_entry_free(table->data_vec[i]))
> >  			continue;
> >
> > -		WARN_ONCE(true,
> > +		gid_kref = kref_read(&table->data_vec[i]->kref);
> > +		WARN_ONCE(gid_kref > 1,
> >  			  "GID entry ref leak for dev %s index %d ref=%u\n",
> > -			  dev_name(&device->dev), i,
> > -			  kref_read(&table->data_vec[i]->kref));
> > +			  dev_name(&device->dev), i, gid_kref);
> >  	}
>
> I'm not convinced, I think the bug here is something wrong on the
> refcounting side not the freeing side. Ref should not be 1. Seems like
> missing error unwinding in the init side.

I dropped this patch as the real fix is here
1403c8b14765 ("IB/core: Fix ib_cache_setup_one error flow cleanup")

Thanks

>
> Jason
>
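
To illustrate the point made in the review, here is a minimal user-space sketch of why a strict "no references left" check at teardown is useful: it flags an init path that took references and then failed without dropping them. Everything below (the model_* names, the table size, the fail/unwind flags) is hypothetical and made up for illustration. It is not the kernel's GID table code, and it does not reproduce what commit 1403c8b14765 changes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
#define TABLE_SZ 4

struct model_entry {
	int refs;	/* 0 means the slot is free */
};

struct model_table {
	struct model_entry slots[TABLE_SZ];
};

/* Take the initial reference that marks a slot as populated. */
static void model_add(struct model_table *t, size_t i)
{
	t->slots[i].refs = 1;
}

/* Drop one reference; the slot is free again once the count hits zero. */
static void model_put(struct model_table *t, size_t i)
{
	if (t->slots[i].refs > 0)
		t->slots[i].refs--;
}

/*
 * Populate up to `want` slots, then optionally simulate a later setup
 * step failing. With `unwind` set, the error path drops every reference
 * it took. Returns 0 on success, -1 on the simulated failure.
 */
static int model_setup(struct model_table *t, size_t want, bool fail,
		       bool unwind)
{
	size_t i;

	for (i = 0; i < want && i < TABLE_SZ; i++)
		model_add(t, i);

	if (!fail)
		return 0;

	if (unwind)
		while (i-- > 0)
			model_put(t, i);
	return -1;
}

/* Count slots that still hold a reference at teardown (strict check). */
static size_t model_count_leaks(const struct model_table *t)
{
	size_t i, leaks = 0;

	for (i = 0; i < TABLE_SZ; i++)
		if (t->slots[i].refs != 0)
			leaks++;
	return leaks;
}
```

In this model, a failed setup without unwinding leaves every populated slot at refs == 1, and the strict check reports them. Relaxing the check to "refs > 1" would hide exactly that case, whereas unwinding in the error path makes the strict check pass.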