On Mon, May 23, 2022 at 12:36 AM Bob Pearson <rpearsonhpe@xxxxxxxxx> wrote:
>
> Currently the rxe driver checks if any previous operation
> is not complete to determine if a fence wait is required.
> This is not correct. For a regular fence only previous
> read or atomic operations must be complete while for a local
> invalidate fence all previous operations must be complete.
> This patch corrects this behavior.
>
> Fixes: 8700e3e7c4857 ("Soft RoCE (RXE) - The software RoCE driver")
> Signed-off-by: Bob Pearson <rpearsonhpe@xxxxxxxxx>
> ---
>  drivers/infiniband/sw/rxe/rxe_req.c | 42 ++++++++++++++++++++++++-----
>  1 file changed, 36 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_req.c b/drivers/infiniband/sw/rxe/rxe_req.c
> index ae5fbc79dd5c..f36263855a45 100644
> --- a/drivers/infiniband/sw/rxe/rxe_req.c
> +++ b/drivers/infiniband/sw/rxe/rxe_req.c
> @@ -163,16 +163,41 @@ static struct rxe_send_wqe *req_next_wqe(struct rxe_qp *qp)
>  		     (wqe->state != wqe_state_processing)))
>  		return NULL;
>
> -	if (unlikely((wqe->wr.send_flags & IB_SEND_FENCE) &&
> -						(index != cons))) {
> -		qp->req.wait_fence = 1;
> -		return NULL;
> -	}
> -
>  	wqe->mask = wr_opcode_mask(wqe->wr.opcode, qp);
>  	return wqe;
>  }
>
> +/**
> + * rxe_wqe_is_fenced - check if next wqe is fenced
> + * @qp: the queue pair
> + * @wqe: the next wqe
> + *
> + * Returns: 1 if wqe is fenced (needs to wait)
> + *	    0 if wqe is good to go
> + */
> +static int rxe_wqe_is_fenced(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
> +{
> +	unsigned int cons;
> +
> +	if (!(wqe->wr.send_flags & IB_SEND_FENCE))
> +		return 0;
> +
> +	cons = queue_get_consumer(qp->sq.queue, QUEUE_TYPE_FROM_CLIENT);
> +
> +	/* Local invalidate fence (LIF) see IBA 10.6.5.1
> +	 * Requires ALL previous operations on the send queue
> +	 * are complete.
> +	 */
> +	if (wqe->wr.opcode == IB_WR_LOCAL_INV)
> +		return qp->req.wqe_index != cons;

Do I understand correctly that, according to this code, a wr with opcode
IB_WR_LOCAL_INV also needs to have IB_SEND_FENCE set for this to work?
If that is the desired behaviour, can you point out where in the spec
this is mentioned? Thanks.

> +
> +	/* Fence see IBA 10.8.3.3
> +	 * Requires that all previous read and atomic operations
> +	 * are complete.
> +	 */
> +	return atomic_read(&qp->req.rd_atomic) != qp->attr.max_rd_atomic;
> +}
> +
>  static int next_opcode_rc(struct rxe_qp *qp, u32 opcode, int fits)
>  {
>  	switch (opcode) {
> @@ -636,6 +661,11 @@ int rxe_requester(void *arg)
>  	if (unlikely(!wqe))
>  		goto exit;
>
> +	if (rxe_wqe_is_fenced(qp, wqe)) {
> +		qp->req.wait_fence = 1;
> +		goto exit;
> +	}
> +
>  	if (wqe->mask & WR_LOCAL_OP_MASK) {
>  		ret = rxe_do_local_ops(qp, wqe);
>  		if (unlikely(ret))
>
> base-commit: c5eb0a61238dd6faf37f58c9ce61c9980aaffd7a
> --
> 2.34.1
>
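For anyone following along, the two fence conditions the patch distinguishes can be modeled in a small stand-alone sketch. Everything below (the `mock_*` names, the struct, and its fields) is invented for illustration and only mimics the shape of the rxe logic; it is not the real driver code or types:

```c
#include <assert.h>

#define MOCK_SEND_FENCE	(1 << 0)

enum mock_opcode { MOCK_WR_SEND, MOCK_WR_LOCAL_INV };

/* Simplified stand-in for the qp state the fence check consults. */
struct mock_qp {
	unsigned int wqe_index;	/* index of the wqe being considered */
	unsigned int cons;	/* consumer index: oldest incomplete wqe */
	int rd_atomic;		/* remaining read/atomic credits */
	int max_rd_atomic;	/* credit count when nothing is outstanding */
};

/* Returns 1 if the wqe must wait behind its fence, 0 otherwise. */
static int mock_wqe_is_fenced(const struct mock_qp *qp,
			      enum mock_opcode opcode, int send_flags)
{
	if (!(send_flags & MOCK_SEND_FENCE))
		return 0;

	/* Local invalidate fence: ALL previous wqes must be complete,
	 * i.e. the consumer index has caught up to this wqe. */
	if (opcode == MOCK_WR_LOCAL_INV)
		return qp->wqe_index != qp->cons;

	/* Regular fence: only previous reads/atomics must be complete,
	 * i.e. every read/atomic credit has been returned. */
	return qp->rd_atomic != qp->max_rd_atomic;
}
```

The point of the split is visible here: a fenced send may proceed while older sends are still in flight, as long as no reads/atomics are outstanding, whereas a fenced local invalidate must drain the whole send queue.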