> On 18 Mar 2021, at 20:21, Haakon Bugge <haakon.bugge@xxxxxxxxxx> wrote: > > With the introduction of RoCE systems, a CM REQ message will contain (pasted from Wireshark): > > Primary Hop Limit: 0x40 > .... 0... = Primary Subnet Local: 0x0 > > This because cma_resolve_iboe_route() has: > > if (((struct sockaddr *)&id_priv->id.route.addr.dst_addr)->sa_family != AF_IB) > /* TODO: get the hoplimit from the inet/inet6 device */ > route->path_rec->hop_limit = addr->dev_addr.hoplimit; > else > route->path_rec->hop_limit = 1; > > The addr->dev_addr.hoplimit is coming from addr4_resolve(), which has: > > addr->hoplimit = ip4_dst_hoplimit(&rt->dst); > > ip4_dst_hoplimit() returns the value of the sysctl net.ipv4.ip_default_ttl in this case (64). > > For the purpose of this case, consider the CM REQ to have the Primary SL != 0. > > When this REQ gets processed by cm_req_handler(), the cm_process_routed_req() function is called. > > Since the Primary Subnet Local value is zero in the request, and since this is RoCE (Primary Local LID is permissive), the following statement will be executed: > > IBA_SET(CM_REQ_PRIMARY_SL, req_msg, wc->sl); > > At least on the system I ran on, which was equipped with a Mellanox CX-5 HCA, the wc->sl is zero. Hence, the request to setup a connection using an SL != zero, will not be honoured, and a connection using SL zero will be created instead. > > As a side note, in cm_process_routed_req(), we have: > > IBA_SET(CM_REQ_PRIMARY_REMOTE_PORT_LID, req_msg, wc->dlid_path_bits); > > which is strange, since a LID is 16 bits, whereas dlid_path_bits is only eight. > > I am uncertain about the correct fix here. Any input appreciated. I intend to send a patch doing: --- a/drivers/infiniband/core/cm.c +++ b/drivers/infiniband/core/cm.c @@ -2138,7 +2138,8 @@ static int cm_req_handler(struct cm_work *work) goto destroy; } - cm_process_routed_req(req_msg, work->mad_recv_wc->wc); + if (cm_id_priv->av.ah_attr.type != RDMA_AH_ATTR_TYPE_ROCE) + cm_process_routed_req(req_msg, work->mad_recv_wc->wc); memset(&work->path[0], 0, sizeof(work->path[0])); if (cm_req_has_alt_path(req_msg)) if I do not get a push back. Thxs, Håkon