Re: Need to set if_index in ib_init_ah_from_wc() ?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Feb 09, 2017 at 11:24:17AM +0200, Matan Barak wrote:
> On Thu, Feb 9, 2017 at 2:15 AM, Parav Pandit <parav@xxxxxxxxxxxx> wrote:
> >
> >> From: Jason Gunthorpe [mailto:jgunthorpe@xxxxxxxxxxxxxxxxxxxx]
> >> Sent: Wednesday, February 8, 2017 6:02 PM
> >> To: Parav Pandit <parav@xxxxxxxxxxxx>
> >> Cc: Matan Barak <matanb@xxxxxxxxxxxxxxxxxx>; Roland Dreier
> >> <roland@xxxxxxxxxxxxxxx>; linux-rdma@xxxxxxxxxxxxxxx; Matan Barak
> >> <matanb@xxxxxxxxxxxx>
> >> Subject: Re: Need to set if_index in ib_init_ah_from_wc() ?
> >>
> >> On Thu, Feb 09, 2017 at 12:00:09AM +0000, Parav Pandit wrote:
> >>
> >> > > That still breaks link local addresses on vlan devices, so it is an
> >> > > ugly hack, not a solution.
> >> >
> >> > In presence of vlan, shouldn't we be passing the ifindex of the vlan
> >> netdev?
> >>
> >> yes, that is exactly my point...
> >
> > Oh ok. I get it. I am on right path to fix it than.

Use the gid cache to figure out, not seaching netdevs..

> > Additionally,
> > when there is macvlan based slave device present on this vlan device, I will pass the ifindex of that particular netdev.
> > Now since we don't have MAC address coming in ib_wc nor in IB/RoCEv2 Annex spec, code needs to refer to the
> > (a) ifaddr of the vlan netdev
> > and
> > (b) ifaddr of the slave netdevs
> > Compare the DGID of the grh with ifaddr and use that netdev's ifindex for the first matching entry.
> >
> > Sounds reasonable now?
> >
> 
> Since we don't get the DMAC address, I think the GID cache shouldn't
> carry entries which the hardware can't differentiate upon.

Well, more specifically, with this limiatation, the hardware must
*NEVER* receive a packet that does not match the primary MAC of the
port.

Which goes back to my first point: The hardware should not receive
something that is not in the GID cache, period. It sounds like this
basic sanity is being viloated in some current rocee hardware???

If any scenario makes the GID cache ambiguous then it cannot be
allowed. eg apparently macvlan must be denied, which makes this pretty
useless for namespaces.

>From your comments, I think the hardware function is going to have to
be improved to make this sane. I continue to recommend returning the
GID cache index in the WC.

> It might be ok for some cases in the transmit side (as you choose
> the smac based on the netdev attached to the GID entry, but if you
> add a vxlan based interface, you won't be able to add the
> appropriate headers). We can leave this as is or making it
> symmetrical.

Again, it is madness to allow the hardware to receive a packet on a UD
QP that is not present in the GID table, and it is unworkable to have
a WC that doesn't unambiguously refer to a GID Table entry.

So yes, things like vxlan should not be in the gid table if the
hardware cannot cope with it.

> So, when adding a GID, we need to consult the hardware capabilities
> regarding the metadata it can provide in the completion. If the
> hardware isn't capable of creating/stripping one of the headers of
> this netdev, there's no reason to add it.

Yes. This is also why long ago I suggested that the hardware driver
should provide a function to resolve the WC into a GID cach entry and
that function can rely on hardware unique capabilities.

IMHO userspace should not be exposed to this and UD QPs should be
locked by hardware to a single netdev worth of gid cache
entries. Anything weaker invites exploitation when we talk about
namespaces.

> If the hardware supports creating/stripping the required headers but
> it doesn't support reporting them in the completion or all fields
> are supported but there are conflicting entries, you could either
> consult the ingress route before adding these GIDs or add them both
> and consult the

No. Hardware must support all features: create/strip/report/per-QP
filter before the GID cache can have an entry. No subsets can be
permitted.

This probably means existing firmware/hardware/drivers cannot support
macvlan and maybe others, but that is much better than trying to
support it in an unsafe and insecure way.

That probbably answers Parav's earlier question about duplicates in
the gid table: It is a bug today that can even happen.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux