Re: [PATCH for-next v14 08/10] RDMA/rxe: Stop lookup of partially built objects

Jason Gunthorpe <jgg@xxxxxxxxxx> · Mon, 9 May 2022 09:21:18 -0300

On Wed, Apr 20, 2022 at 08:40:41PM -0500, Bob Pearson wrote:
> Currently the rdma_rxe driver has a security weakness due to giving
> objects which are partially initialized indices allowing external
> actors to gain access to them by sending packets which refer to
> their index (e.g. qpn, rkey, etc) causing unpredictable results.
> 
> This patch adds a new API rxe_finalize(obj) which enables looking up
> pool objects from indices using rxe_pool_get_index() for
> AH, QP, MR, and MW. They are added in create verbs only after the
> objects are fully initialized.
> 
> It also adds wait for completion to destroy/dealloc verbs to assure that
> all references have been dropped before returning to rdma_core by
> implementing a new rxe_pool API rxe_cleanup() which drops a reference
> to the object and then waits for all other references to be dropped.
> When the last reference is dropped the object is completed by kref.
> After that it cleans up the object and if locally allocated frees
> the memory.
> 
> Combined with deferring cleanup code to type specific cleanup
> routines this allows all pending activity referring to objects to
> complete before returning to rdma_core.

This seems fine, except the AH problem needs to be fixed before
applying this patch since it looks like it worsens it on the destoy
side now too.

> +int __rxe_cleanup(struct rxe_pool_elem *elem)
> +{
>  	struct rxe_pool *pool = elem->pool;
> +	struct xarray *xa = &pool->xa;
> +	static int timeout = RXE_POOL_TIMEOUT;
> +	unsigned long flags;
> +	int ret, err = 0;
> +	void *xa_ret;
>  
> -	xa_erase(&pool->xa, elem->index);
> +	/* erase xarray entry to prevent looking up
> +	 * the pool elem from its index
> +	 */
> +	xa_lock_irqsave(xa, flags);
> +	xa_ret = __xa_erase(xa, elem->index);
> +	xa_unlock_irqrestore(xa, flags);
> +	WARN_ON(xa_err(xa_ret));
> +
> +	/* if this is the last call to rxe_put complete the
> +	 * object. It is safe to touch elem after this since
> +	 * it is freed below
> +	 */
> +	__rxe_put(elem);
> +
> +	if (timeout) {
> +		ret = wait_for_completion_timeout(&elem->complete, timeout);

The atomic version of this should just spin, I guess, but is there
even any condition where an AH can be held? Maybe the atomic version
just returns succees..

> +		if (!ret) {
> +			pr_warn("Timed out waiting for %s#%d to complete\n",
> +				pool->name, elem->index);
> +			if (++pool->timeouts >= RXE_POOL_MAX_TIMEOUTS)
> +				timeout = 0;
> +
> +			err = -EINVAL;

This is also something of a bug, the xa was erased but now we have
this zombie element floating around, so when we call this again (and
the core code will call it again) the xa will be corrupted.

Could mitigate this by using xa_cmpxchng instead of erase and ignore
the return code.

Jason