Re: [PATCH for-next v6 1/8] RDMA/rxe: Replace RB tree by xarray for indexes

Bob Pearson <rpearsonhpe@xxxxxxxxx> · Wed, 8 Dec 2021 18:16:21 -0600

On 12/7/21 13:09, Jason Gunthorpe wrote:
> On Mon, Dec 06, 2021 at 03:12:36PM -0600, Bob Pearson wrote:
>>  	if (pool->flags & RXE_POOL_INDEX) {
>> -		pool->index.tree = RB_ROOT;
>> -		err = rxe_pool_init_index(pool, info->max_index,
>> -					  info->min_index);
>> -		if (err)
>> -			goto out;
>> +		xa_init_flags(&pool->xarray.xa, XA_FLAGS_ALLOC);
>> +		pool->xarray.limit.max = info->max_index;
>> +		pool->xarray.limit.min = info->min_index;
>> +	} else {
>> +		/* if pool not indexed just use xa spin_lock */
>> +		spin_lock_init(&pool->xarray.xa.xa_lock);
> 
> xarray's don't cost anything to init, so there is no reason to do
> something like this.
OK
> 
>> +/* drop a reference to an object */
>> +static inline bool __rxe_drop_ref(struct rxe_pool_elem *elem)
>> +{
>> +	bool ret;
>> +
>> +	rxe_pool_lock_bh(elem->pool);
>> +	ret = kref_put(&elem->ref_cnt, rxe_elem_release);
>> +	rxe_pool_unlock_bh(elem->pool);
> 
> This is a bit strange, why does something need to hold a lock around a
> kref?

This also relates to your comment on 8/8 patch. There seems to be a race opportunity
between the call to kref_put(&obj->elem, rxe_elem_release) and the call in
rxe_elem_release() to xa_erase(). If a duplicate or delayed packet arrives after the the
final kref_put() and before the xa_erase() one can still lookup the object from its index
(qpn, rkey, etc.) and take a reference to it. The use of kref_get_unless_zero and
locking around kref_put and __xa_erase was an attempt to fix this. Once you call kref_put
with the ref count going to zero there is no way to prevent the object getting its
cleanup routine called.

Bob

> Jason
>