On 7/27/21 7:02 AM, Leon Romanovsky wrote:
> On Tue, Jul 27, 2021 at 06:31:59AM -0500, Bob Pearson wrote:
>> On 7/27/21 6:30 AM, Leon Romanovsky wrote:
>>> On Mon, Jul 26, 2021 at 04:58:16PM -0500, Bob Pearson wrote:
>>>> Currently several rxe objects hold references to PDs which are ref-
>>>> counted. This replicates work already done by RDMA core which takes
>>>> references to PDs in the ib objects which are contained in the rxe
>>>> objects. This patch removes struct rxe_pd from rxe objects and removes
>>>> reference counting for PDs except for PD alloc and PD dealloc. It also
>>>> adds inline extractor routines which return PDs from the PDs in the
>>>> ib objects. The names of these are made consistent including a rxe_
>>>> prefix.
>>>>
>>>> Signed-off-by: Bob Pearson <rpearsonhpe@xxxxxxxxx>
>>>> ---
>>>>  drivers/infiniband/sw/rxe/rxe_comp.c  |  4 ++--
>>>>  drivers/infiniband/sw/rxe/rxe_loc.h   |  4 ++--
>>>>  drivers/infiniband/sw/rxe/rxe_mr.c    |  8 +++----
>>>>  drivers/infiniband/sw/rxe/rxe_mw.c    | 31 +++++++++++----------------
>>>>  drivers/infiniband/sw/rxe/rxe_qp.c    |  9 +-------
>>>>  drivers/infiniband/sw/rxe/rxe_req.c   |  2 +-
>>>>  drivers/infiniband/sw/rxe/rxe_resp.c  |  4 ++--
>>>>  drivers/infiniband/sw/rxe/rxe_verbs.c | 26 ++++++----------------
>>>>  drivers/infiniband/sw/rxe/rxe_verbs.h | 24 +++++++++++++++------
>>>>  9 files changed, 48 insertions(+), 64 deletions(-)
>>>
>>> Last time when I looked on it, I came to conclusion that all RXE
>>> references can be dropped.
>>>
>>> Thanks
>>>
>> This is a step in that direction. There are more coming.
>
> Glad to hear, thank you for your work.
>
>> Regards,
>>
>> Bob

The other ones I can immediately get rid of are AHs, CQs and SRQs. The ones I think may require ref counting are MRs, MWs, QPs, and XRCSRQs.
Each of these gets looked up from information in RoCE packets: rkeys, qpns, and srq_nums. For reliable transports on slow networks this can require that these objects hang around for a while, and the user has no visibility into this unless there is a completion event waiting, which is not the case for e.g. reads, writes or atomics on the target side. There can be races between destroying these objects and messages completing, causing a kernel oops.

Do you know another way to address these cases?

Bob
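
P.S. To make the race concrete, here is a minimal userspace sketch of the lookup-takes-a-reference pattern I have in mind. It is not rxe code: struct sketch_obj, the sketch_* helpers and the fixed-size table are all hypothetical names made up for illustration, and C11 atomics stand in for what would presumably be a kref or refcount_t in the kernel. The idea is that a packet lookup by key only succeeds if it can take a reference, so a destroy that runs while a message is still in flight unpublishes the object but defers cleanup until the last put.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define TABLE_SIZE 16

/* Hypothetical object standing in for an MR, MW, QP or SRQ. */
struct sketch_obj {
	atomic_int refcnt;	/* 0 means teardown has already started */
	unsigned int key;	/* stands in for an rkey, qpn or srq_num */
	bool cleaned_up;	/* set when the last reference is dropped */
};

/* Assumption: a real driver would protect this table with a lock or RCU. */
static struct sketch_obj *table[TABLE_SIZE];

/* Take a reference only if the count has not already reached zero. */
static bool sketch_get_unless_zero(struct sketch_obj *obj)
{
	int old = atomic_load(&obj->refcnt);

	while (old != 0) {
		/* On failure, old is reloaded with the current count. */
		if (atomic_compare_exchange_weak(&obj->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; the final put is where cleanup would happen. */
static void sketch_put(struct sketch_obj *obj)
{
	int old = atomic_fetch_sub(&obj->refcnt, 1);

	assert(old > 0);
	if (old == 1)
		obj->cleaned_up = true;
}

/* Publish the object; the table entry owns the initial reference. */
static void sketch_add(struct sketch_obj *obj, unsigned int key)
{
	atomic_init(&obj->refcnt, 1);
	obj->key = key;
	obj->cleaned_up = false;
	table[key % TABLE_SIZE] = obj;
}

/* Packet path: find the object by key and pin it for this message. */
static struct sketch_obj *sketch_lookup(unsigned int key)
{
	struct sketch_obj *obj = table[key % TABLE_SIZE];

	if (!obj || obj->key != key || !sketch_get_unless_zero(obj))
		return NULL;
	return obj;
}

/* Destroy verb: unpublish, then drop the table's reference. */
static void sketch_destroy(struct sketch_obj *obj)
{
	table[obj->key % TABLE_SIZE] = NULL;
	sketch_put(obj);
}
```

With this shape, a responder that looked the object up before the destroy keeps a valid pointer until it calls sketch_put(), and any lookup after the destroy fails cleanly instead of touching freed memory. Whether something like this can be replaced by what RDMA core already does is exactly what I am asking.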