Several race conditions have been discovered in the current rdma_rxe
driver. They mostly relate to races between normal operations and
destroying objects. This patch series
 - Makes several minor cleanups in rxe_pool.[ch]
 - Replaces the red-black trees currently used for indices with xarrays
 - Simplifies the API for keyed objects
 - Corrects several reference counting errors
 - Adds wait for completions to the paths in verbs APIs which destroy
   objects.

The patch series has been changed to RFC PATCH instead of PATCH for-next
because I have little experience with RCU locking and would like someone
else to review this code (in 18/26 and 24/26). RCU locking should improve
performance at large scale but this has not been tested yet.

This patch series applies cleanly to current for-next.
commit e783362eb54cd99b2cac8b3a9aeac942e6f6ac07 (tag: v5.17-rc1,
	origin/wip/jgg-for-rc, origin/wip/jgg-for-next,
	origin/wip/for-testing, origin/for-rc, origin/for-next,
	origin/HEAD, for-next)

Signed-off-by: Bob Pearson <rpearsonhpe@xxxxxxxxx>
---
v9
  Corrected issues reported by Jason Gunthorpe
  Converted locking in rxe_mcast.c and rxe_pool.c to use RCU
  Split up the patches into smaller changes
v8
  Fixed an additional race in 3/8 which was not handled correctly.
v7
  Corrected issues reported by Jason Gunthorpe
Link: https://lore.kernel.org/linux-rdma/20211207190947.GH6385@xxxxxxxxxx/
Link: https://lore.kernel.org/linux-rdma/20211207191857.GI6385@xxxxxxxxxx/
Link: https://lore.kernel.org/linux-rdma/20211207192824.GJ6385@xxxxxxxxxx/
v6
  Fixed a kzalloc flags bug.
  Fixed comment bug reported by 'Kernel Test Robot'.
  Changed type of rxe_pool.c in __rxe_fini().
v5
  Removed patches already accepted into for-next and addressed comments
  from Jason Gunthorpe.
v4
  Restructured patch series to change to xarray earlier which
  greatly simplified the changes.
  Rebased to current for-next
v3
  Changed rxe_alloc to use GFP_KERNEL
  Addressed other comments by Jason Gunthorpe
  Merged the previous 06/10 and 07/10 patches into one since they
  overlapped
  Added some minor cleanups as 10/10
v2
  Rebased to current for-next.
  Added 4 additional patches

Bob Pearson (26):
  RDMA/rxe: Move rxe_mcast_add/delete to rxe_mcast.c
  RDMA/rxe: Move rxe_mcast_attach/detach to rxe_mcast.c
  RDMA/rxe: Rename rxe_mc_grp and rxe_mc_elem
  RDMA/rxe: Enforce IBA o10-2.2.3
  RDMA/rxe: Remove rxe_drop_all_macst_groups
  RDMA/rxe: Remove qp->grp_lock and qp->grp_list
  RDMA/rxe: Use kzmalloc/kfree for mca
  RDMA/rxe: Rename grp to mcg and mce to mca
  RDMA/rxe: Introduce RXECB(skb)
  RDMA/rxe: Split rxe_rcv_mcast_pkt into two phases
  RDMA/rxe: Replace locks by rxe->mcg_lock
  RDMA/rxe: Replace pool key by rxe->mcg_tree
  RDMA/rxe: Remove key'ed object support
  RDMA/rxe: Remove mcg from rxe pools
  RDMA/rxe: Add code to cleanup mcast memory
  RDMA/rxe: Add comments to rxe_mcast.c
  RDMA/rxe: Separate code into subroutines
  RDMA/rxe: Convert mca read locking to RCU
  RDMA/rxe: Reverse the sense of RXE_POOL_NO_ALLOC
  RDMA/rxe: Delete _locked() APIs for pool objects
  RDMA/rxe: Replace obj by elem in declaration
  RDMA/rxe: Replace red-black trees by xarrays
  RDMA/rxe: Change pool locking to RCU
  RDMA/rxe: Add wait_for_completion to pool objects
  RDMA/rxe: Fix ref error in rxe_av.c
  RDMA/rxe: Replace mr by rkey in responder resources

 drivers/infiniband/sw/rxe/rxe.c       | 107 +---
 drivers/infiniband/sw/rxe/rxe_av.c    |  19 +-
 drivers/infiniband/sw/rxe/rxe_hdr.h   |   3 +
 drivers/infiniband/sw/rxe/rxe_loc.h   |  33 +-
 drivers/infiniband/sw/rxe/rxe_mcast.c | 678 ++++++++++++++++------
 drivers/infiniband/sw/rxe/rxe_mr.c    |   2 +-
 drivers/infiniband/sw/rxe/rxe_mw.c    |  11 +-
 drivers/infiniband/sw/rxe/rxe_net.c   |  35 +-
 drivers/infiniband/sw/rxe/rxe_pool.c  | 798 ++++++++++----------------
 drivers/infiniband/sw/rxe/rxe_pool.h  | 233 +++-----
 drivers/infiniband/sw/rxe/rxe_qp.c    |  29 +-
 drivers/infiniband/sw/rxe/rxe_recv.c  |  98 ++--
 drivers/infiniband/sw/rxe/rxe_req.c   |  55 +-
 drivers/infiniband/sw/rxe/rxe_resp.c  | 125 ++--
 drivers/infiniband/sw/rxe/rxe_verbs.c |  54 +-
 drivers/infiniband/sw/rxe/rxe_verbs.h |  26 +-
 16 files changed, 1159 insertions(+), 1147 deletions(-)
 rewrite drivers/infiniband/sw/rxe/rxe_mcast.c (86%)
 rewrite drivers/infiniband/sw/rxe/rxe_pool.c (67%)
 rewrite drivers/infiniband/sw/rxe/rxe_pool.h (73%)

--
2.32.0