On Tue, Oct 12, 2021 at 03:19:46PM -0500, Bob Pearson wrote:
> On 10/12/21 1:34 AM, Leon Romanovsky wrote:
> > On Sun, Oct 10, 2021 at 06:59:25PM -0500, Bob Pearson wrote:
> >> There are possible race conditions related to attempting to access
> >> rxe pool objects at the same time as the pools or elements are being
> >> freed. This series of patches addresses these races.
> >
> > Can we get rid of this pool?
> >
> > Thanks
> >
> >>
> >> Bob Pearson (6):
> >>   RDMA/rxe: Make rxe_alloc() take pool lock
> >>   RDMA/rxe: Copy setup parameters into rxe_pool
> >>   RDMA/rxe: Save object pointer in pool element
> >>   RDMA/rxe: Combine rxe_add_index with rxe_alloc
> >>   RDMA/rxe: Combine rxe_add_key with rxe_alloc
> >>   RDMA/rxe: Fix potential race condition in rxe_pool
> >>
> >>  drivers/infiniband/sw/rxe/rxe_mcast.c |   5 +-
> >>  drivers/infiniband/sw/rxe/rxe_mr.c    |   1 -
> >>  drivers/infiniband/sw/rxe/rxe_mw.c    |   5 +-
> >>  drivers/infiniband/sw/rxe/rxe_pool.c  | 235 +++++++++++++-------------
> >>  drivers/infiniband/sw/rxe/rxe_pool.h  |  67 +++-----
> >>  drivers/infiniband/sw/rxe/rxe_verbs.c |  10 --
> >>  6 files changed, 140 insertions(+), 183 deletions(-)
> >>
> >> --
> >> 2.30.2
> >>
>
> Not sure which 'this' you mean? This set of patches is motivated by someone at HPE
> running into seg faults caused very infrequently by rdma packets causing seg faults
> when trying to copy data to or from an MR. This can only happen (other than just a dumb
> bug which doesn't seem to be the case) by a late packet arriving after the MR is
> de-registered. The root cause of that is the way rxe currently defers cleaning up
> objects with krefs and potential races between cleanup and new packets looking up
> rkeys. I found a lot of potential race conditions and tried to close them off. There
> are another couple of patches coming as well.

I have no doubts that this series fixes RXE, but my request was more general.
Is there a way/path to remove everything declared in rxe_pool.c|h?

Thanks