RE: Need to set if_index in ib_init_ah_from_wc() ?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Jason, Roland,

> -----Original Message-----
> From: linux-rdma-owner@xxxxxxxxxxxxxxx [mailto:linux-rdma-
> owner@xxxxxxxxxxxxxxx] On Behalf Of Jason Gunthorpe
> Sent: Monday, January 30, 2017 1:54 PM
> To: Roland Dreier <roland@xxxxxxxxxxxxxxx>
> Cc: linux-rdma@xxxxxxxxxxxxxxx; Matan Barak <matanb@xxxxxxxxxxxx>
> Subject: Re: Need to set if_index in ib_init_ah_from_wc() ?
> 
> On Sun, Jan 29, 2017 at 05:21:43PM -0800, Roland Dreier wrote:
> > I recently debugged an issue where an RDMA-CM / RoCE connection to a
> > link-local IPv6 address takes 1 second to establish.  The delay comes
> > when the passive side cm_req_handler() calls
> > cm_init_av_for_response(), which calls ib_init_ah_from_wc().
> >
> > That calls into rdma_addr_find_l2_eth_by_grh() with the link-local
> > address, but with 0 passed in through the if_index parameter.  So
> >
> > The fix seems as simple as adding
> >
> >     if_index = idev->ifindex;
> >
> > to ib_init_ah_from_wc() before the call to
> > rdma_addr_find_l2_eth_by_grh() - this can't fail, since we return an
> > error if finding idev fails.
> 
> Hum, it is certainly a big conceptual mistake to not anchor a route lookup in a
> netdev at this point.
> 
> Both roceev1 and v2 need to use the netdev that was used to create the UD
> QP to anchor these lookups, that netdev should be a child of some kind from
> the get_netdev (eg a vlan, or macvlan, etc)
> 
As you know UD QP is not created with netdev (AH binds to netdev).
So AH and RC QPs are going to be anchored now around GID.

> This is necessary to make namespaces and other subtle things work properly.

Yes. I am currently testing my patches for this support in our setup.
I will post the patches soon.
However fix that Roland pointed out is needed regardless.
I actually hit ifindex case even with IPv4 last week but was unable to reproduce it.

> 
> So, right idea, wrong netdev.
> 
> > However, I'm not sure if it's quite that simple - how does this
> > interact with RoCEv2, or other addressing like IPv4 or non-link-local
> > IPv6? I'm not sure I understand what the cases where resolved_dev will
> > be != idev, or if passing in idev->ifindex will break this.
> 
> I think this brokenness fell out of the above confusion, it should be removed,
> rdma_addr_find_dmac_by_grh should not be callable without a proper
> netdev.
> 
IP route apis are working mostly based on the iface_id as input.
So I am thinking to continue with ifindex input argument instead of changing to netdev pointer.
However there is corner case where ifindex could become invalid between rdma_addr_find_l2_eth_by_grh() and dev_get_by_index().
So I am open for netdev as well. I will review this path one more time for possible netdev input.
Currently rdma_addr_find_l2_eth_by_grh() has ifindex as optional argument, which is removed in my sandbox now.

> The != case can happen if the gid table resolve differently than the route
> table, but it is conceptually wrong to lookup in the gid table before doing a
> route lookup, and I think these apis are poorly designed to enforce that
> requirement.
> 
Gid lookup is still done after the route lookup using get_sgid_index_from_eth().
ib_get_gids_from_rdma_hdr() just gets the GID formatted properly.
I think this part is ok?
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux