On Wed, Mar 24, 2021 at 02:34:13PM +0000, Håkon Bugge wrote:
>
>
> > On 23 Mar 2021, at 20:46, Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:
> >
> > On Mon, Mar 22, 2021 at 02:35:32PM +0100, Håkon Bugge wrote:
> >> On RoCE systems, a CM REQ contains a Primary Hop Limit > 1 and Primary
> >> Subnet Local is zero.
> >>
> >> In cm_req_handler(), the cm_process_routed_req() function is
> >> called. Since the Primary Subnet Local value is zero in the request,
> >> and since this is RoCE (Primary Local LID is permissive), the
> >> following statement will be executed:
> >>
> >>      IBA_SET(CM_REQ_PRIMARY_SL, req_msg, wc->sl);
> >>
> >> This corrupts SL in req_msg if it was different from zero. In other
> >> words, a request to setup a connection using an SL != zero, will not
> >> be honored, and a connection using SL zero will be created instead.
> >>
> >> Fixed by not calling cm_process_routed_req() on RoCE systems.
> >>
> >> Fixes: 3971c9f6dbf2 ("IB/cm: Add interim support for routed paths")
> >> Signed-off-by: Håkon Bugge <haakon.bugge@xxxxxxxxxx>
> >> drivers/infiniband/core/cm.c | 3 ++-
> >> 1 file changed, 2 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
> >> index 3d194bb..6adbaea 100644
> >> +++ b/drivers/infiniband/core/cm.c
> >> @@ -2138,7 +2138,8 @@ static int cm_req_handler(struct cm_work *work)
> >>  		goto destroy;
> >>  	}
> >>
> >> -	cm_process_routed_req(req_msg, work->mad_recv_wc->wc);
> >> +	if (cm_id_priv->av.ah_attr.type != RDMA_AH_ATTR_TYPE_ROCE)
> >> +		cm_process_routed_req(req_msg, work->mad_recv_wc->wc);
> >
> > why use ah_attr.type when a few lines below we have:
> >
> > 	if (gid_attr &&
> > 	    rdma_protocol_roce(work->port->cm_dev->ib_device,
> > 			       work->port->port_num)) {
> >
> > ?
> >
> > I suspect you can just move this into the else?
>
> I can counter that by saying ah_attr.type is used ~10 lines further
> down in the conditional call to sa_path_set_dmac() ;-)

Hum, OK.

Please send an additional patch to unify everything around av.ah_attr.type

> > 	if (gid_attr &&
> > 	    rdma_protocol_roce(work->port->cm_dev->ib_device,
> > 			       work->port->port_num)) {
>
> I cannot really see how gid_attr could be null. If
> ib_init_ah_attr_from_wc() succeeds, it is set after the call to
> cm_init_av_for_response() above. May be using ah_attr.type in this
> test instead, for uniformity and readability?

The GRH is optional, ib_init_ah_attr_from_wc() only sets it conditionally.

Applied to for-next

Thanks,
Jason
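
For readers following the thread, the SL overwrite described in the commit message can be sketched as a small user-space model. This is only an illustration under stated assumptions: struct model_req, struct model_wc, MODEL_LID_PERMISSIVE, model_process_routed_req() and model_handle_req() are hypothetical stand-ins, not the kernel's structures or functions, and only the SL assignment mentioned in the commit message is modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the CM REQ fields and the work completion
 * discussed in the thread; these are NOT the kernel definitions. */
#define MODEL_LID_PERMISSIVE 0xFFFFu

struct model_req {
	bool subnet_local;   /* models Primary Subnet Local */
	uint16_t local_lid;  /* models Primary Local LID */
	uint8_t sl;          /* models Primary SL requested by the sender */
};

struct model_wc {
	uint8_t sl;          /* SL the REQ itself was received on */
};

/* Models the behaviour the commit message describes: when Subnet Local
 * is zero and the LID is permissive, the SL is overwritten from the wc. */
static void model_process_routed_req(struct model_req *req,
				     const struct model_wc *wc)
{
	if (!req->subnet_local && req->local_lid == MODEL_LID_PERMISSIVE)
		req->sl = wc->sl;
}

/* Models the patched call site: skip the fix-up entirely on RoCE, so the
 * requested SL survives. */
static void model_handle_req(struct model_req *req,
			     const struct model_wc *wc, bool is_roce)
{
	if (!is_roce)
		model_process_routed_req(req, wc);
}
```

In this model, a RoCE request asking for SL 3 that arrives on SL 0 keeps SL 3 when handled through model_handle_req() with is_roce set, whereas calling model_process_routed_req() unconditionally resets it to 0, which is the corruption the patch avoids.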