On 1/8/2020 10:26 PM, Alex Rosenbaum wrote:
> A combination of the flow_label field in the IPv6 header and UDP source port
> field in RoCE v2.0 is used to identify a group of packets that must be
> delivered in order by the network, end-to-end.
> These fields are used to create entropy for network routers (ECMP), load
> balancers and 802.3ad link aggregation switching that are not aware of RoCE IB
> headers.
>
> The flow_label field is defined by a 20 bit hash value. CM based connections
> will use a hash function definition based on the service type (QP Type) and
> Service ID (SID). Where CM services are not used, the 20 bit hash will be
> calculated from the source and destination QPN values.
> Drivers will derive the RoCE v2.0 UDP src_port from the flow_label result.
>
> UDP source port selection must adhere to IANA port allocation ranges. Thus we
> will be using the IANA recommendation for the Ephemeral port range of:
> 49152-65535, or in hex: 0xC000-0xFFFF.
>
> The below calculations take into account the importance of producing a symmetric
> hash result so we can support symmetric hash calculation of network elements.
>
> Hash Calculation for RDMA IP CM Service
> =======================================
> For RDMA IP CM Services, based on QP1 iMAD usage and connected RC QPs using the
> RDMA IP CM Service ID, the flow label will be calculated according to IBTA CM
> REQ private data info and Service ID.
>
> The flow label hash function is defined as follows.
> Extract the following fields from the CM IP REQ:
>     CM_REQ.ServiceID.DstPort [2 Bytes]
>     CM_REQ.PrivateData.SrcPort [2 Bytes]
>     u32 hash = DstPort * SrcPort;
>     hash ^= (hash >> 16);
>     hash ^= (hash >> 8);
>     AH_ATTR.GRH.flow_label = hash AND IB_GRH_FLOWLABEL_MASK;
>
>     #define IB_GRH_FLOWLABEL_MASK 0x000FFFFF
>
> The result of the above hash will be kept in the CM's route path record
> connection context and will be used throughout its lifetime for all subsequent
> CM messages on both ends of the connection (including REP, REJ, DREQ, DREP, ..).
> Once the connection is established, the corresponding Connected RC QPs, on both
> ends of the connection, will update their context with the calculated RDMA IP
> CM Service based flow_label and UDP src_port values at the Connect phase of
> the active side and Accept phase of the passive side of the connection.
>
> CM will provide the calculated value of the flow_label hash (20 bit) result
> in the 'uint32_t flow_label' field of 'struct ibv_global_route' in 'struct
> ibv_ah_attr'.
> The 'struct ibv_ah_attr' is passed by the CM to the provider library when
> modifying a connected QP's (e.g.: RC) state by calling 'ibv_modify_qp(qp,
> ah_attr, attr_mask |= IBV_QP_AV)' or when creating an AH for working with
> datagram QP's (e.g.: UD) by calling ibv_create_ah(ah_attr).
>
> Hash Calculation for non-RDMA CM Service ID
> ===========================================
> For non CM QP's, the application can define the flow_label value in the
> 'struct ibv_ah_attr' when modifying the connected QP's (e.g.: RC) or creating
> an AH for the datagram QP's (e.g.: UD).

Hi Alex, when creating an AH for the datagram QP, I think we don't have the
src.QP and dst.QP, so we can't set the flow_label here?

> If the provided flow_label value is zero, not set by the application (e.g.:
> legacy cases), then verbs providers should use the src.QP[24bit] and
> dst.QP[24bit] as input arguments for flow_label calculation.
> As QPN's are an array of 3 bytes, the multiplication will result in 6 bytes
> value. We'll define a flow_label value as:
>     DstQPn [3 Bytes]
>     SrcQPn [3 Bytes]
>     u64 hash = DstQPn * SrcQPn;
>     hash ^= (hash >> 20);
>     hash ^= (hash >> 40);
>     AH_ATTR.GRH.flow_label = hash AND IB_GRH_FLOWLABEL_MASK;
>
> Hash Calculation for UDP src_port
> =================================
> Providers supporting RoCEv2 will use the 'flow_label' value as input to
> calculate the RoCEv2 UDP src_port, which will be used in the QP context or the
> AH context.
>
> UDP src_port calculations from flow label:
> [while considering the 14 bits UDP port range according to IANA recommendation]
>     AH_ATTR.GRH.flow_label [20 bits]
>     u32 fl_low = fl & 0x03FFF;
>     u32 fl_high = fl & 0xFC000;
>     u16 udp_sport = fl_low XOR (fl_high >> 14);
>     RoCE.UDP.src_port = udp_sport OR IB_ROCE_UDP_ENCAP_VALID_PORT
>
>     #define IB_ROCE_UDP_ENCAP_VALID_PORT 0xC000
>
> This is a v2 follow-up on "[RFC] RoCE v2.0 UDP Source Port Entropy" [1]
>
> [1] https://www.spinics.net/lists/linux-rdma/msg73735.html
>
> Signed-off-by: Alex Rosenbaum <alexr@xxxxxxxxxxxx>
>
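
For reference, the three calculations quoted above could be written in plain C
roughly as below. This is only a sketch of the RFC pseudocode, not driver or
rdma-core code: the function names (cm_flow_label, qpn_flow_label,
flow_label_to_udp_sport) and the use of <stdint.h> types are assumptions made
for this example, and masking the QPNs to 24 bits is added here for safety.

```c
#include <stdint.h>

#define IB_GRH_FLOWLABEL_MASK        0x000FFFFFu
#define IB_ROCE_UDP_ENCAP_VALID_PORT 0xC000u

/* Hypothetical helper: 20-bit flow label from the CM REQ port pair.
 * The product is commutative, so both sides compute the same label. */
uint32_t cm_flow_label(uint16_t dst_port, uint16_t src_port)
{
    uint32_t hash = (uint32_t)dst_port * src_port;

    hash ^= hash >> 16;
    hash ^= hash >> 8;
    return hash & IB_GRH_FLOWLABEL_MASK;
}

/* Hypothetical helper: 20-bit flow label from two 24-bit QPNs.
 * The 48-bit product needs a 64-bit accumulator. */
uint32_t qpn_flow_label(uint32_t dst_qpn, uint32_t src_qpn)
{
    uint64_t hash = (uint64_t)(dst_qpn & 0xFFFFFFu) * (src_qpn & 0xFFFFFFu);

    hash ^= hash >> 20;
    hash ^= hash >> 40;
    return (uint32_t)(hash & IB_GRH_FLOWLABEL_MASK);
}

/* Hypothetical helper: fold the 20-bit flow label into 14 bits and
 * place the result in the ephemeral range 0xC000-0xFFFF. */
uint16_t flow_label_to_udp_sport(uint32_t fl)
{
    uint32_t fl_low  = fl & 0x03FFFu;   /* low 14 bits */
    uint32_t fl_high = fl & 0xFC000u;   /* high 6 bits */

    return (uint16_t)((fl_low ^ (fl_high >> 14)) |
                      IB_ROCE_UDP_ENCAP_VALID_PORT);
}
```

Since both label functions depend only on the product of their two inputs,
swapping source and destination gives the same flow_label, and therefore the
same UDP src_port, on both ends.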