On 7/31/23 13:31, Jason Gunthorpe wrote: > On Mon, Jul 31, 2023 at 01:23:59PM -0500, Bob Pearson wrote: >> On 7/31/23 13:15, Jason Gunthorpe wrote: >>> On Fri, Jul 21, 2023 at 03:50:21PM -0500, Bob Pearson wrote: >>>> This patch gives a more detailed list of objects that are not >>>> freed in case of error before the module exits. >>>> >>>> Signed-off-by: Bob Pearson <rpearsonhpe@xxxxxxxxx> >>>> --- >>>> drivers/infiniband/sw/rxe/rxe_pool.c | 12 +++++++++++- >>>> 1 file changed, 11 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/drivers/infiniband/sw/rxe/rxe_pool.c b/drivers/infiniband/sw/rxe/rxe_pool.c >>>> index cb812bd969c6..3249c2741491 100644 >>>> --- a/drivers/infiniband/sw/rxe/rxe_pool.c >>>> +++ b/drivers/infiniband/sw/rxe/rxe_pool.c >>>> @@ -113,7 +113,17 @@ void rxe_pool_init(struct rxe_dev *rxe, struct rxe_pool *pool, >>>> >>>> void rxe_pool_cleanup(struct rxe_pool *pool) >>>> { >>>> - WARN_ON(!xa_empty(&pool->xa)); >>>> + unsigned long index; >>>> + struct rxe_pool_elem *elem; >>>> + >>>> + xa_lock(&pool->xa); >>>> + xa_for_each(&pool->xa, index, elem) { >>>> + rxe_err_dev(pool->rxe, "%s#%d: Leaked", pool->name, >>>> + elem->index); >>>> + } >>>> + xa_unlock(&pool->xa); >>>> + >>>> + xa_destroy(&pool->xa); >>>> } >>> >>> Is this why? Just count the number of unfinalized objects and report >>> if there is difference, don't mess up the xarray. >>> >>> Jason >> This is why I made the last change but I really didn't like that there was no >> way to lookup the qp from its index since we were using a NULL xarray entry to >> indicate the state of the qp. Making it explicit, i.e. a variable in the struct >> seems much more straight forward. Not sure why you hated the last >> change. > > Because if you don't call finalize abort has to be deterministic, and > abort can't be that if it someone else can access the poitner and, eg, > take a reference. rxe_pool_get_index() is the only 'correct' way to look up the pointer and it checks the valid state (now). Scanning the xarray or just looking up the qp is something outside the scope of the normal flows. Like listing orphan objects on module exit. Memory ownership didn't change. It is still the same. The only change is how we mark whether the object is valid for lookup. Bob > > It breaks the entire model of how the memory ownership works during creation. > > Jason