Hi Devesh, Selvin, Somnath, > -----Original Message----- > From: linux-rdma-owner@xxxxxxxxxxxxxxx [mailto:linux-rdma- > owner@xxxxxxxxxxxxxxx] On Behalf Of Jason Gunthorpe > Sent: Monday, January 30, 2017 1:54 PM > To: Roland Dreier <roland@xxxxxxxxxxxxxxx> > Cc: linux-rdma@xxxxxxxxxxxxxxx; Matan Barak <matanb@xxxxxxxxxxxx> > Subject: Re: Need to set if_index in ib_init_ah_from_wc() ? > > On Sun, Jan 29, 2017 at 05:21:43PM -0800, Roland Dreier wrote: > > I recently debugged an issue where an RDMA-CM / RoCE connection to a > > link-local IPv6 address takes 1 second to establish. The delay comes > > when the passive side cm_req_handler() calls > > cm_init_av_for_response(), which calls ib_init_ah_from_wc(). > > > > That calls into rdma_addr_find_l2_eth_by_grh() with the link-local > > address, but with 0 passed in through the if_index parameter. So > > > > The fix seems as simple as adding > > > > if_index = idev->ifindex; > > > > to ib_init_ah_from_wc() before the call to > > rdma_addr_find_l2_eth_by_grh() - this can't fail, since we return an > > error if finding idev fails. > > Hum, it is certainly a big conceptual mistake to not anchor a route lookup in a > netdev at this point. > > Both roceev1 and v2 need to use the netdev that was used to create the UD QP > to anchor these lookups, that netdev should be a child of some kind from the > get_netdev (eg a vlan, or macvlan, etc) > > This is necessary to make namespaces and other subtle things work properly. > > So, right idea, wrong netdev. > > > However, I'm not sure if it's quite that simple - how does this > > interact with RoCEv2, or other addressing like IPv4 or non-link-local > > IPv6? I'm not sure I understand what the cases where resolved_dev will > > be != idev, or if passing in idev->ifindex will break this. > > I think this brokenness fell out of the above confusion, it should be removed, > rdma_addr_find_dmac_by_grh should not be callable without a proper netdev. > > The != case can happen if the gid table resolve differently than the route table, > but it is conceptually wrong to lookup in the gid table before doing a route > lookup, and I think these apis are poorly designed to enforce that requirement. > > Jason I have fixed this issue where rdma_addr_find_l2_eth_by_grh() accepts netdev of the GID. Derivation of GID is based on vlan, and DGID of the incoming GRH, network_type fields. Finding out the right GID depends above fields from the cached gid table. Before I post the patch, We found that bnx_re driver doesn't set IB_WC_WITH_VLAN in the QP1 wc. Fail to set vlan id and IB_WC_WITH_VLAN flag when incoming QP1 packet in vlan, will fail to search the right GID. Please also see recent important fix in qedr driver [1] who now correctly sets IB_WC_WITH_VLAN, in case if you need reference. Most drivers are setting it correctly now. Will you please fix this issue in bnx_re driver as well - to set IB_WC_WITH_VLAN, vlan_id, SL when vlan is present in QP1 UD receive packets? Please let me know in case if you need more information. [1] https://www.spinics.net/lists/linux-rdma/msg55115.html Parav -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html