Re: [PATCH for-next V2 05/11] IB/core: Add rdma_network_type to wc

> Eh? I think you've missed the point, there is no net device when
> looking at a wc.
>
> Look, here is a concrete direction:
>
> Replace all the crap in
> ib_init_ah_from_wc/get_sgid_index_from_eth/rdma_addr_find_dmac_by_grh
>
> with a straightforward
>
>    rdma_dgid_index_from_wc(
>                         const struct ib_qp *qp,
>                         const struct ib_wc *wc,
>                         const struct ib_grh *grh,
>                         u16 *gid_index)
>
> Sort of function that reads the GRH and wc and returns the unambiguous
> gid index that was used to receive that packet on the UD QP.
>

I already answered this too, but I'll do it again.
The RoCEv2 spec says that the L3 header is scattered to the receive WQE
in the following way:
IPv6 and RoCEv1 - the 40 bytes of the L3 header (IPv6 header or GRH) go
to the first 40 bytes of the receive buffers.
IPv4 - the 20 bytes of the L3 header go to the second half of the first
40 bytes of the receive buffers. The first 20 bytes remain undefined.
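
A minimal sketch of the layout described above, to make the offsets concrete. All names here (sketch_network_type, sketch_l3_view, sketch_l3_header) are hypothetical and chosen for illustration; they are not taken from the patch or from the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type tag for illustration only; not the kernel's enum. */
enum sketch_network_type {
	SKETCH_NET_ROCE_V1,	/* L3 header is a GRH */
	SKETCH_NET_IPV4,
	SKETCH_NET_IPV6,
};

#define SKETCH_L3_AREA_LEN	40	/* bytes reserved at the start of the buffer */
#define SKETCH_IPV4_HDR_LEN	20

/* Where the L3 header lives inside the first 40 bytes, and how long it is. */
struct sketch_l3_view {
	const uint8_t *hdr;
	size_t len;
};

static struct sketch_l3_view
sketch_l3_header(const uint8_t *buf, enum sketch_network_type type)
{
	struct sketch_l3_view v;

	assert(buf != NULL);

	if (type == SKETCH_NET_IPV4) {
		/* IPv4: header sits in the second half; first 20 bytes undefined. */
		v.hdr = buf + (SKETCH_L3_AREA_LEN - SKETCH_IPV4_HDR_LEN);
		v.len = SKETCH_IPV4_HDR_LEN;
	} else {
		/* IPv6 and RoCEv1: the 40-byte header fills the whole area. */
		v.hdr = buf;
		v.len = SKETCH_L3_AREA_LEN;
	}
	return v;
}
```

Note that the caller has to know the network type before it can tell which of the two layouts applies.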

Now, if you think about how you would deduce network_type from the GRH,
you'll see that it requires checksum validation and other checks, and
you end up with a method that is not 100% error free. So, to eliminate
that computation (which is heavy compared with the other option) and to
be free from false deductions, you have the option of getting
network_type from the hardware. If you do have hardware that supports
it, why give it up?
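
To illustrate why the software deduction is only a guess, here is a rough sketch of such a heuristic. The names (guess_net_type, ipv4_hdr_sum, guess_network_type) are hypothetical, and the exact set of checks is an assumption made for illustration, not the code from any patch: it treats the second 20 bytes as IPv4 if they look like an option-less IPv4/UDP header with a valid checksum, and otherwise uses the version nibble and next-header byte of the first 40 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum guess_net_type { GUESS_ROCE_V1, GUESS_IPV4, GUESS_IPV6, GUESS_UNKNOWN };

#define GUESS_PROTO_UDP 17

/* Ones' complement sum of a 20-byte IPv4 header; 0xFFFF means it verifies. */
static uint16_t ipv4_hdr_sum(const uint8_t *h)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i < 20; i += 2)
		sum += ((uint32_t)h[i] << 8) | h[i + 1];
	while (sum >> 16)
		sum = (sum & 0xFFFF) + (sum >> 16);
	return (uint16_t)sum;
}

/* Guess the network type from the first 40 bytes of the receive buffer. */
static enum guess_net_type guess_network_type(const uint8_t *buf)
{
	const uint8_t *v4;

	assert(buf != NULL);
	v4 = buf + 20;

	/* Version 4, IHL 5, protocol UDP, header checksum verifies. */
	if (v4[0] == 0x45 && v4[9] == GUESS_PROTO_UDP &&
	    ipv4_hdr_sum(v4) == 0xFFFF)
		return GUESS_IPV4;

	/* GRH and IPv6 both carry version 6; tell them apart by next header. */
	if ((buf[0] >> 4) == 6)
		return buf[6] == GUESS_PROTO_UDP ? GUESS_IPV6 : GUESS_ROCE_V1;

	return GUESS_UNKNOWN;
}
```

In a GRH, bytes 20..39 hold the tail of the SGID and the DGID, so a GRH whose GID bytes happen to form a checksum-valid IPv4 header would be misclassified as IPv4 by this sketch. That is the kind of false deduction a flag from the hardware avoids.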



> The gid cache is not allowed to create an ambiguity the driver cannot
> resolve.
>
> That said, I wouldn't object to vendor-specific bits in the wc. Ie if
> mlx hardware needs a network_type bit to implement
> rdma_find_dgid_index_from_wc, then fine - define a vendor specific
> place to put it. In this case rdma_find_dgid_index_from_wc would be a
> driver call back, which is fine, and what Caitlin was talking about.
>

This is not a Mellanox-specific flag. See this quote from the spec:

A17.4.5.1 UD COMPLETION QUEUE ENTRIES (CQES)
For UD, the Completion Queue Entry (CQE) includes remote address
information (InfiniBand Specification Vol. 1 Rev 1.2.1 Section
11.4.2.1). For RoCEv2, the remote address information comprises the
source L2 Address and a flag that indicates if the received frame is
an IPv4, IPv6 or RoCE packet.



> But, it is not part of our verbs API, and I'd *strongly* encourage
> other vendors and future hardware to simply return the gid index that
> the hardware matched instead of requiring the software to try and
> guess after the fact.

That could be problematic for virtual machine architectures that give a
VM a portion of the entire GID table, which the VM then indexes as 0..N.
>
> This is the same issue/argument we went around and around on the
> lladdr ipoib details when working on the namespace patches, about how
> important it is to resolve the namespace from the hardware headers.
>
> Of course once we have the gid index we now have the net device and
> other information needed to make namespaces work.
>
> .. and this is part of what I mean what I said the interface from the
> gid cache code is not a sane API and needs to be changed.
>
> Jason
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html