On Mon, Jul 31, 2023 at 01:51:59PM -0500, Bob Pearson wrote: > On 7/31/23 13:43, Jason Gunthorpe wrote: > > On Mon, Jul 31, 2023 at 01:42:09PM -0500, Bob Pearson wrote: > >> On 7/31/23 13:31, Jason Gunthorpe wrote: > >>> On Mon, Jul 31, 2023 at 01:23:59PM -0500, Bob Pearson wrote: > >>>> On 7/31/23 13:15, Jason Gunthorpe wrote: > >>>>> On Fri, Jul 21, 2023 at 03:50:21PM -0500, Bob Pearson wrote: > >>>>>> This patch gives a more detailed list of objects that are not > >>>>>> freed in case of error before the module exits. > >>>>>> > >>>>>> Signed-off-by: Bob Pearson <rpearsonhpe@xxxxxxxxx> > >>>>>> --- > >>>>>> drivers/infiniband/sw/rxe/rxe_pool.c | 12 +++++++++++- > >>>>>> 1 file changed, 11 insertions(+), 1 deletion(-) > >>>>>> > >>>>>> diff --git a/drivers/infiniband/sw/rxe/rxe_pool.c b/drivers/infiniband/sw/rxe/rxe_pool.c > >>>>>> index cb812bd969c6..3249c2741491 100644 > >>>>>> --- a/drivers/infiniband/sw/rxe/rxe_pool.c > >>>>>> +++ b/drivers/infiniband/sw/rxe/rxe_pool.c > >>>>>> @@ -113,7 +113,17 @@ void rxe_pool_init(struct rxe_dev *rxe, struct rxe_pool *pool, > >>>>>> > >>>>>> void rxe_pool_cleanup(struct rxe_pool *pool) > >>>>>> { > >>>>>> - WARN_ON(!xa_empty(&pool->xa)); > >>>>>> + unsigned long index; > >>>>>> + struct rxe_pool_elem *elem; > >>>>>> + > >>>>>> + xa_lock(&pool->xa); > >>>>>> + xa_for_each(&pool->xa, index, elem) { > >>>>>> + rxe_err_dev(pool->rxe, "%s#%d: Leaked", pool->name, > >>>>>> + elem->index); > >>>>>> + } > >>>>>> + xa_unlock(&pool->xa); > >>>>>> + > >>>>>> + xa_destroy(&pool->xa); > >>>>>> } > >>>>> > >>>>> Is this why? Just count the number of unfinalized objects and report > >>>>> if there is difference, don't mess up the xarray. > >>>>> > >>>>> Jason > >>>> This is why I made the last change but I really didn't like that there was no > >>>> way to lookup the qp from its index since we were using a NULL xarray entry to > >>>> indicate the state of the qp. Making it explicit, i.e. a variable in the struct > >>>> seems much more straight forward. Not sure why you hated the last > >>>> change. > >>> > >>> Because if you don't call finalize abort has to be deterministic, and > >>> abort can't be that if it someone else can access the poitner and, eg, > >>> take a reference. > >> > >> rxe_pool_get_index() is the only 'correct' way to look up the pointer and > >> it checks the valid state (now). Scanning the xarray or just looking up > >> the qp is something outside the scope of the normal flows. Like listing > >> orphan objects on module exit. > > > > Maybe at this instance, but people keep changing this and it is > > fundamentally wrong to store a pointer to an incompletely initialized > > memory for other threads to see. > > > > Especially for some minor debugging feature. > > Maybe warning comments would help. There are lots of ways to break the code, sigh. > The typical one is someone makes a change that breaks reference > counting. Don't do it wrong in the first place is the most important thing. Don't put incompletely intialized objects in global memory. It is a very simple rule. Jason