On Mon, May 23, 2022 at 5:51 AM Bob Pearson <rpearsonhpe@xxxxxxxxx> wrote: > > On 5/22/22 18:59, Haris Iqbal wrote: > > On Mon, May 23, 2022 at 12:36 AM Bob Pearson <rpearsonhpe@xxxxxxxxx> wrote: > >> > >> Currently the rxe driver checks if any previous operation > >> is not complete to determine if a fence wait is required. > >> This is not correct. For a regular fence only previous > >> read or atomic operations must be complete while for a local > >> invalidate fence all previous operations must be complete. > >> This patch corrects this behavior. > >> > >> Fixes: 8700e3e7c4857 ("Soft RoCE (RXE) - The software RoCE driver") > >> Signed-off-by: Bob Pearson <rpearsonhpe@xxxxxxxxx> > >> --- > >> drivers/infiniband/sw/rxe/rxe_req.c | 42 ++++++++++++++++++++++++----- > >> 1 file changed, 36 insertions(+), 6 deletions(-) > >> > >> diff --git a/drivers/infiniband/sw/rxe/rxe_req.c b/drivers/infiniband/sw/rxe/rxe_req.c > >> index ae5fbc79dd5c..f36263855a45 100644 > >> --- a/drivers/infiniband/sw/rxe/rxe_req.c > >> +++ b/drivers/infiniband/sw/rxe/rxe_req.c > >> @@ -163,16 +163,41 @@ static struct rxe_send_wqe *req_next_wqe(struct rxe_qp *qp) > >> (wqe->state != wqe_state_processing))) > >> return NULL; > >> > >> - if (unlikely((wqe->wr.send_flags & IB_SEND_FENCE) && > >> - (index != cons))) { > >> - qp->req.wait_fence = 1; > >> - return NULL; > >> - } > >> - > >> wqe->mask = wr_opcode_mask(wqe->wr.opcode, qp); > >> return wqe; > >> } > >> > >> +/** > >> + * rxe_wqe_is_fenced - check if next wqe is fenced > >> + * @qp: the queue pair > >> + * @wqe: the next wqe > >> + * > >> + * Returns: 1 if wqe is fenced (needs to wait) > >> + * 0 if wqe is good to go > >> + */ > >> +static int rxe_wqe_is_fenced(struct rxe_qp *qp, struct rxe_send_wqe *wqe) > >> +{ > >> + unsigned int cons; > >> + > >> + if (!(wqe->wr.send_flags & IB_SEND_FENCE)) > >> + return 0; > >> + > >> + cons = queue_get_consumer(qp->sq.queue, QUEUE_TYPE_FROM_CLIENT); > >> + > >> + /* Local invalidate fence (LIF) see IBA 10.6.5.1 > >> + * Requires ALL previous operations on the send queue > >> + * are complete. > >> + */ > >> + if (wqe->wr.opcode == IB_WR_LOCAL_INV) > >> + return qp->req.wqe_index != cons; > > > > > > Do I understand correctly that according to this code a wr with opcode > > IB_WR_LOCAL_INV needs to have the IB_SEND_FENCE also set for this to > > work? > > > > If that is the desired behaviour, can you point out where in spec this > > is mentioned. > > According to IBA "Local invalidate fence" (LIF) and regular Fence behave > differently. (See the referenced sections in the IBA.) For a local invalidate > operation the fence bit fences all previous operations. That was the old behavior > which made no distinction between local invalidate and other operations. > The change here are the other operations with a regular fence which should only > requires read and atomic operations to be fenced. > > Not sure what you mean by 'also'. Per the IBA if the LIF is set then you have > strict invalidate ordering if not then you have relaxed ordering. The kernel verbs > API only has one fence bit and does not have a separate LIF bit so I am > interpreting them to share the one bit. I see. Now I understand. Thanks for the explanation. I do have a follow-up question. For a IB_WR_LOCAL_INV wr, without the fence bit means relaxed ordering. This would mean that the completion for that wr must take place "before any subsequent WQE has begun execution". From what I understand, multiple rxe_requester instances can run in parallel and pick up wqes and execute them. How is the relaxed ordering criteria fulfilled? > > Bob > > > > Thanks. > > > > > >> + > >> + /* Fence see IBA 10.8.3.3 > >> + * Requires that all previous read and atomic operations > >> + * are complete. > >> + */ > >> + return atomic_read(&qp->req.rd_atomic) != qp->attr.max_rd_atomic; > >> +} > >> + > >> static int next_opcode_rc(struct rxe_qp *qp, u32 opcode, int fits) > >> { > >> switch (opcode) { > >> @@ -636,6 +661,11 @@ int rxe_requester(void *arg) > >> if (unlikely(!wqe)) > >> goto exit; > >> > >> + if (rxe_wqe_is_fenced(qp, wqe)) { > >> + qp->req.wait_fence = 1; > >> + goto exit; > >> + } > >> + > >> if (wqe->mask & WR_LOCAL_OP_MASK) { > >> ret = rxe_do_local_ops(qp, wqe); > >> if (unlikely(ret)) > >> > >> base-commit: c5eb0a61238dd6faf37f58c9ce61c9980aaffd7a > >> -- > >> 2.34.1 > >> >