On 2025/3/20 0:48, Jason Gunthorpe wrote:
> On Fri, Mar 07, 2025 at 01:01:50AM +0200, Nikolay Aleksandrov wrote:
>> Hi all,
>> This patch-set introduces minimal Ultra Ethernet driver infrastructure and
>> the lowest Ultra Ethernet sublayer - the Packet Delivery Sublayer (PDS),
>> which underpins the entire communication model of the Ultra Ethernet
>> Transport[1] (UET). Ultra Ethernet is a new RDMA transport designed for
>> efficient AI and HPC communication.
>
> I was away while this discussion happened so I've gone through and
> read the threads, looked at the patches and I don't think I've changed
> my view since I talked to Enfabrica privately on this topic almost a
> year ago.
>
> I do not agree with creating a new subsystem (or whatever you are
> calling drivers/ultraeth) for a single RDMA protocol and see nothing
> new here to change my mind. I would likely NAK the direction I see in
> this RFC, as I have other past attempts to build RDMA HW interfaces
> outside of the RDMA subsystem.
>
> Since none of that past discussion seems to have been acknowledged or
> rebutted in this series I will repeat the main points:
>
> 1) I'm aware of something like 5-7 new protocols that are competing
>    for the same market as Ultra Ethernet. We can't give everyone and
>    their dog a new subsystem (or whatever) and all the maintainability
>    negatives that come with that. As a matter of maintainability we
>    need to see consolidation here, not fragmentation!
>
>    Yes, UE is a consortium driven standard, which is unique and a big
>    positive, but I don't believe anyone can say for certain what
>    direction the industry is going to go in. Many consortium standards
>    have failed to get adoption in the past even with a large number of
>    member companies.
>
>    Nor can we know what concepts in UE are going to be copied into
>    other competing RDMA transports. See my other remarks on job key
>    for an example.
>    Prematurely siloing stuff in drivers/ultraeth is very much the
>    wrong technical direction for maintainability.
>
>    That said, I think UE should be in the kernel and have a fair
>    chance to compete for market share. Just in a maintainable and
>    appropriate way while the industry evolves.
>
> 2) Due to the above, I'm pretty confident we will see RDMA NICs
>    supporting a lot of different protocols. In fact they already do.
>
>    From a kernel maintainability perspective we really want one RDMA
>    driver leveraging as much common infrastructure between the
>    protocols as possible. We do not want to see a single HW driver
>    further split up needlessly to other subsystems, that would be a
>    big maintainability downside.
>
>    To put a clear point on this, mlx5 has been gaining new protocols
>    and fitting into the existing driver model for a number of years
>    now. In fact there is speculation that UE could be implemented in
>    mlx5 RDMA with minimal kernel changes. There would be no reason to
>    try to mess up the driver to also interact with this stuff in
>    drivers/ultraeth as seems to be proposed here.
>
>    I think other HW will be similar. UE isn't so radically different
>    that every HW path will need to diverge from classical RDMA. Nor is
>    it so dissimilar to other competing proposals. We don't want
>    artificial differences; we want to create things that can be re-used
>    when appropriate.
>
>    Leon's response to Bart is correct, we already have similar
>    examples of almost everything UE does. Bart is also correct that
>    verbs would be a PITA, but RDMA userspace has moved beyond verbs
>    limitations years ago now. A lot of mlx5 stuff is not using verbs
>    today, for instance. EFA and other examples use extensive stuff
>    beyond verbs.

Regarding reusing the existing RDMA subsystem for a new protocol:

Currently EFA seems to layer an RDM layer on top of the SRD transport
layer, see [1]. The RDM layer is implemented in software in libfabric,
while SRD seems to be implemented in hardware, which provides the
'Scalable Reliable Datagram' service through the QP type
EFA_QP_DRIVER_TYPE_SRD.

I am not sure whether SRD and RDM form a clean layering from a protocol
design perspective. But if the hardware implements both the SRD and RDM
layers, then there might be two types of objects that need managing: an
SRD object might be shared between different applications, and an RDM
object would need to be created on top of an SRD object.

As the existing RDMA subsystem doesn't seem to support the above use
case yet, and as we are discussing either a possible new subsystem or
updating the existing subsystem to support a new protocol, it would be
good to discuss whether the above case can be supported or whether
another new subsystem would be needed for that use case too.

1. https://github.com/ofiwg/libfabric/blob/main/prov/efa/docs/efa_rdm_protocol_v4.md