On 12/7/21 13:09, Jason Gunthorpe wrote: > On Mon, Dec 06, 2021 at 03:12:36PM -0600, Bob Pearson wrote: >> if (pool->flags & RXE_POOL_INDEX) { >> - pool->index.tree = RB_ROOT; >> - err = rxe_pool_init_index(pool, info->max_index, >> - info->min_index); >> - if (err) >> - goto out; >> + xa_init_flags(&pool->xarray.xa, XA_FLAGS_ALLOC); >> + pool->xarray.limit.max = info->max_index; >> + pool->xarray.limit.min = info->min_index; >> + } else { >> + /* if pool not indexed just use xa spin_lock */ >> + spin_lock_init(&pool->xarray.xa.xa_lock); > > xarray's don't cost anything to init, so there is no reason to do > something like this. OK > >> +/* drop a reference to an object */ >> +static inline bool __rxe_drop_ref(struct rxe_pool_elem *elem) >> +{ >> + bool ret; >> + >> + rxe_pool_lock_bh(elem->pool); >> + ret = kref_put(&elem->ref_cnt, rxe_elem_release); >> + rxe_pool_unlock_bh(elem->pool); > > This is a bit strange, why does something need to hold a lock around a > kref? This also relates to your comment on 8/8 patch. There seems to be a race opportunity between the call to kref_put(&obj->elem, rxe_elem_release) and the call in rxe_elem_release() to xa_erase(). If a duplicate or delayed packet arrives after the the final kref_put() and before the xa_erase() one can still lookup the object from its index (qpn, rkey, etc.) and take a reference to it. The use of kref_get_unless_zero and locking around kref_put and __xa_erase was an attempt to fix this. Once you call kref_put with the ref count going to zero there is no way to prevent the object getting its cleanup routine called. Bob > Jason >