On Sun, Jan 19, 2025 at 10:21:41AM +0200, Leon Romanovsky wrote: > Fixes: 5256edcb98a1 ("RDMA/mlx5: Rework implicit ODP destroy") Cc: stable Fixes a user triggerable oops > - if (!refcount_inc_not_zero(&imr->mmkey.usecount)) > + xa_lock(&imr->implicit_children); > + if (__xa_cmpxchg(&imr->implicit_children, idx, mr, NULL, GFP_KERNEL) != > + mr) { > + xa_unlock(&imr->implicit_children); > return; > + } > > - xa_erase(&imr->implicit_children, idx); > if (MLX5_CAP_ODP(mr_to_mdev(mr)->mdev, mem_page_fault)) > - xa_erase(&mr_to_mdev(mr)->odp_mkeys, > - mlx5_base_mkey(mr->mmkey.key)); > + __xa_erase(&mr_to_mdev(mr)->odp_mkeys, > + mlx5_base_mkey(mr->mmkey.key)); > + xa_unlock(&imr->implicit_children); > + > + if (!refcount_inc_not_zero(&imr->mmkey.usecount)) > + return; It seems the refcount must be done first: /* * If userspace is racing freeing the parent implicit ODP MR * then we can loose the race with parent destruction. In this * case mlx5_ib_free_odp_mr() will free everything in the * implicit_children xarray so NOP is fine. This child MR * cannot be destroyed here because we are under its umem_mutex. */ if (!refcount_inc_not_zero(&imr->mmkey.usecount)) return; What we must not do is remove something from the xarray and then fail to free it. Jason