On 3/11/25 7:11 PM, Sean Hefty wrote:
>> I am not sure if a new subsystem is what this RFC calls for, but rather a
>> discussion about the proper integration of a new RDMA transport into the
>> Linux kernel.
>>
>> Ultra Ethernet Transport is probably not just another transport up for easy
>> integration into the current RDMA subsystem.
>> First of all, its design does not follow the well-known RDMA verbs model
>> inherited from InfiniBand, which has largely shaped the current structure of
>> the RDMA subsystem. While having send, receive and completion queues (and
>> completion counters) to steer message exchange, there is no concept of a
>> queue pair. Endpoints can span multiple queues, can have multiple peer
>> addresses.
>> Communication resources sharing is controlled in a different way than within
>> protection domains. Connections are ephemeral, created and released by the
>> provider as needed. There are more differences. In a nutshell, the UET
>> communication model is trimmed for extreme scalability. Its API semantics
>> follow libfabrics, not RDMA verbs.
>>
>> I think Nik gave us a first still incomplete look at the UET protocol engine to
>> help us understand some of the specifics.
>> It's just the lower part (packet delivery). The implementation of the upper part
>> (resource management, communication semantics, job management) may
>> largely depend on the environment we all choose.
>>
>> IMO, integrating UET with the current RDMA subsystem would ask for its
>> extension to allow exposing all of UETs intended functionality, probably
>> starting with a more generic RDMA device model than current ib_device.
>>
>> The different API semantics of UET may further call for either extending verbs
>> to cover it as well, or exposing a new non-verbs API (libfabrics), or both.
>
> Reading through the submissions, what I found lacking is a description of some higher-level plan. I don't easily see how to relate this series to NICs that may implement UET in HW.
>
> Should the PDS be viewed as a partial implementation of a SW UET 'device', similar to soft RoCE or iWarp? If so, having a description of a proposed device model seems like a necessary first step.
>

Hi Sean,
To quote the cover letter:
"...As there isn't any UET hardware available yet, we introduce a software device model which implements the lowest sublayer of the spec - PDS..."

and
"The plan is to have that split into core Ultra Ethernet module (ultraeth.ko) which is responsible for managing the UET contexts, jobs and all other common/generic UET configuration, and the software UET device model (uecon.ko) which implements the UET protocols for communication in software (e.g. the PDS will be a part of uecon) and is represented by a UDP tunnel network device."

So as I said, it is at a very early stage, but we plan to split this into core UET code and the uecon software device model that implements the UEC specs.

> If, instead, the PDS should be viewed more along the lines of a partial RDS-like path, then that changes the uapi.
>
> Or, am I not viewing this series as intended at all?
>
> It is almost guaranteed that there will be NICs which will support both RoCE and UET, and it's not farfetched to think that an app may use both simultaneously. IMO, a common device model is ideal, assuming exposing a device model is the intent.
>

That is the goal, and we're working on a UET kernel device API, as I've noted in the cover letter.

> I agree that different transport models should not be forced together unnaturally, but I think that's solvable. In the end, the application developer is exposed to libfabric naming anyway. Besides, even a repurposed RDMA name is still better than the naming used within OpenMPI. :)
>
> - Sean

Cheers,
 Nik