Re: [PATCH rdma-next v2 08/11] RDMA/erdma: Add connection management (CM) support


 





On 1/19/22 5:56 PM, Bernard Metzler wrote:


-----Original Message-----
From: Cheng Xu <chengyou@xxxxxxxxxxxxxxxxx>
Sent: Wednesday, 19 January 2022 04:58
To: Bernard Metzler <BMT@xxxxxxxxxxxxxx>; jgg@xxxxxxxx;
dledford@xxxxxxxxxx
Cc: leon@xxxxxxxxxx; linux-rdma@xxxxxxxxxxxxxxx;
KaiShen@xxxxxxxxxxxxxxxxx; tonylu@xxxxxxxxxxxxxxxxx
Subject: [EXTERNAL] Re: [PATCH rdma-next v2 08/11] RDMA/erdma: Add
connection management (CM) support



On 1/18/22 10:49 PM, Bernard Metzler wrote:


<...>

+		cm_id = cep->listen_cep->cm_id;
+
+		event.ird = cep->dev->attrs.max_ird;
+		event.ord = cep->dev->attrs.max_ord;

Also provide the user with the negotiated IRD/ORD of the
reply. Things may have changed upon the peer's request.
See the current siw code for details.


IRD/ORD in ERDMA hardware is fixed, so there is no need to negotiate them
in the MPA request/reply for now. For this reason, we didn't follow siw
with MPA v2.

How is that working? Is the idea that the adapter implements a fixed
value which (hopefully) always exceeds any ULP requested IRD/ORD?

Yes, for better IRQ/ORQ queue buffer alignment in hardware, we use a fixed
depth in our device. ULPs call iw_connect()/iw_accept() with ird/ord in
iw_cm_conn_param; if either value exceeds the fixed depth, we return
-EINVAL immediately. This ensures erdma always has enough IRQ/ORQ
resources.
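The check described above can be sketched in plain userspace C. This is
only an illustration, not the erdma code: struct demo_dev_attrs,
struct demo_conn_param and demo_check_ird_ord() are hypothetical names
standing in for the device attributes and for the ird/ord fields of
iw_cm_conn_param.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the device's fixed IRQ/ORQ depths. */
struct demo_dev_attrs {
	uint32_t max_ird;
	uint32_t max_ord;
};

/* Hypothetical stand-in for the ird/ord a ULP passes at connect/accept. */
struct demo_conn_param {
	uint32_t ird;
	uint32_t ord;
};

/*
 * Reject a request that asks for more than the fixed depth.
 * Returns 0 if the request fits, -EINVAL otherwise.
 */
static int demo_check_ird_ord(const struct demo_dev_attrs *attrs,
			      const struct demo_conn_param *param)
{
	if (param->ird > attrs->max_ird || param->ord > attrs->max_ord)
		return -EINVAL;

	return 0;
}
```

With a check like this, any accepted connection is guaranteed to fit
into the fixed-depth queues, which is why no per-connection negotiation
is needed for resource safety.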

In any case, the negotiated (even if fixed) value MUST be provided
to the ULPs at both ends. In the erdma case, it is likely more than the
ULP was asking for. See RFC 5040, section 6.1:
https://datatracker.ietf.org/doc/html/rfc5040#section-6.1


I think the values of IRD/ORD do influence the behavior of ULPs.
The purpose of IRD/ORD negotiation is to make sure that the RNICs at
both ends have enough resources; otherwise an overflow may happen.

ULPs can always post as many RDMA Reads as they like (without exceeding
max_send_wr), no matter what the value of IRD/ORD is; RDMA Reads will be
delayed if the ORD value goes to zero [1]. In the erdma case, with a fixed
ORQ/IRQ depth, the ORD/IRD is more like a flow control credit. We will
consider this.
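To illustrate the "flow control credit" view, here is a minimal
userspace C model. It is a sketch under assumptions, not how erdma or
any RNIC implements it: struct demo_read_credits, demo_post_read() and
demo_read_done() are hypothetical names. Posting never fails because of
ORD; a read that finds no free credit is simply delayed until an earlier
read completes.

```c
#include <stdint.h>

/* Hypothetical model of ORD used as a credit limit. */
struct demo_read_credits {
	uint32_t ord;       /* maximum outstanding RDMA Reads */
	uint32_t in_flight; /* reads currently outstanding */
	uint32_t delayed;   /* reads posted but waiting for a credit */
};

/* Post one RDMA Read. Returns 1 if it starts now, 0 if it is delayed. */
static int demo_post_read(struct demo_read_credits *c)
{
	if (c->in_flight < c->ord) {
		c->in_flight++;
		return 1;
	}

	c->delayed++;
	return 0;
}

/* One read completed: free its credit and start a delayed read, if any. */
static void demo_read_done(struct demo_read_credits *c)
{
	if (c->in_flight == 0)
		return;

	c->in_flight--;
	if (c->delayed > 0) {
		c->delayed--;
		c->in_flight++;
	}
}
```

In this model the ULP sees only added latency when credits run out,
never an error, which matches the behavior described in [1].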

[1] https://datatracker.ietf.org/doc/html/draft-hilland-rddp-verbs-00#section-6.5




+	} else {
+		cm_id = cep->cm_id;
+	}
+
+	if (reason == IW_CM_EVENT_CONNECT_REQUEST ||
+	    reason == IW_CM_EVENT_CONNECT_REPLY) {
+		u16 pd_len = be16_to_cpu(cep->mpa.hdr.params.pd_len);
+

Does erdma support MPA protocol version 2 and the enhanced connection
setup protocol? In that case, some private data contains protocol
information and must be hidden from the user.


For now we follow MPA v1. Due to the special network environment in Cloud
VPC, we extend MPA v1: we exchange information in an extended header,
which follows the original MPA v1 header.
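The layout described above can be sketched as follows. The MPA v1
header fields (16-byte key, flags, revision, 16-bit big-endian private
data length) follow RFC 5044; everything about the extension is an
assumption made for illustration: struct demo_ext_hdr, its 8-byte size
and demo_private_data() are hypothetical, and the sketch assumes that
pd_len counts only the ULP's private data, not the extension header.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MPA_KEY_LEN 16

/* MPA v1 request/reply header as laid out in RFC 5044 (20 bytes). */
struct demo_mpa_hdr {
	uint8_t key[DEMO_MPA_KEY_LEN];
	uint8_t flags;
	uint8_t revision;
	uint8_t pd_len[2]; /* big-endian private data length */
};

/* Hypothetical extension header; size and contents are placeholders. */
struct demo_ext_hdr {
	uint8_t opaque[8];
};

/*
 * Locate the ULP's private data behind both headers.
 * Returns a pointer into buf and stores the length in *pd_len,
 * or returns NULL if the buffer is too short.
 */
static const uint8_t *demo_private_data(const uint8_t *buf, size_t len,
					uint16_t *pd_len)
{
	const size_t fixed = sizeof(struct demo_mpa_hdr) +
			     sizeof(struct demo_ext_hdr);
	const size_t off = offsetof(struct demo_mpa_hdr, pd_len);
	uint16_t n;

	if (len < fixed)
		return NULL;

	/* Assemble the big-endian length byte by byte. */
	n = (uint16_t)((buf[off] << 8) | buf[off + 1]);
	if (len - fixed < n)
		return NULL;

	*pd_len = n;
	return buf + fixed;
}
```

Only the bytes returned here would be handed to the ULP; the extension
header stays internal to the driver.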

This is control information placed between MPAv1 header and ULP's
private data? So erdma is not interoperable with a device implementing
IETF iWarp?


Yes, but this is not the major reason. As I mentioned in the cover
letter, erdma is an RNIC provided by our MOC hardware in the VPC
environment, and we do not sell erdma cards (indeed, there are no physical
cards; erdma is generated and accelerated by MOC) but rather VMs or
bare-metal machines with the ERDMA feature. There are physically no other
iWarp devices in the VPC for ERDMA to communicate with.

Thanks,
Cheng Xu


