On 05-Dec-18 17:15, Jason Gunthorpe wrote: > On Wed, Dec 05, 2018 at 10:55:59AM +0200, Gal Pressman wrote: >> On 04-Dec-18 17:33, Jason Gunthorpe wrote: >>> On Tue, Dec 04, 2018 at 02:04:16PM +0200, Gal Pressman wrote: >>>> Hello all, >>>> The following patchset introduces the Elastic Fabric Adapter (EFA) driver, that >>>> was pre-announced by Amazon [1]. >>>> >>>> EFA is a networking adapter designed to support user space network >>>> communication, initially offered in the Amazon EC2 environment. First release >>>> of EFA supports datagram send/receive operations and does not support >>>> connection-oriented or read/write operations. >>>> >>>> EFA supports unreliable datagrams (UD) as well as a new unordered, scalable >>>> reliable datagram protocol (SRD). SRD provides support for reliable datagrams >>>> and more complete error handling than typical RD, but, unlike RD, it does not >>>> support ordering nor segmentation. A new queue pair type, IB_QPT_SRD, is added >>>> to expose this new queue pair type. >>>> User verbs are supported via a dedicated userspace libfabric provider. >>>> Kernel verbs and in-kernel services are initially not supported. >>> >>> There are some general expectations for new drivers in rdma you should >>> be aware of.. >>> >>> 1) Accepting usnic to RDMA is widely regarded as a mistake. This means >>> that future drivers that do not implement 'enough' common verbs are >>> not likely to be accepted, as they are not RDMA devices. I'm >>> worried this driver doesn't cross the threshold. >> >> Hi Jason, >> What do you mean by enough common verbs? > > No one has really come up with a guideline, other than usnic was a > mistake. > >> we implement everything our device can support for now - and we >> provide a functional RDMA device. > > Is it a 'functional RDMA device'? It doesn't work with any kernel ULP, > it doesn't have a libibverbs provider, it doesn't implement any > pre-existing protocol... It seems just like usnic. > > .. and there is no documentation, what is the format for AH's on UD? > does it do the verbs memory layout with the 40 byte offset? etc. Just > like usnic :| The usage of AHs is similar to the IB spec. GID is used to create the AH, an opaque handle is returned from the device to be used in TX work requests. The opaque handle is reported in the RX work requests completions; in case the address handle was not yet registered, the device reports the GID in an extended completion entry. Therefore, we do not need the 40 byte offset memory layout. > >> Can you please be more specific as to what is missing? We can also >> implement a basic set of kernel verbs, but since we don't support RC >> QP type there will not be any use for them - I wouldn't want to add >> dead code to the driver just for the sake of having an >> implementation. > > I agree you shouldn't implement kernel verbs for UD. What is missing > is that your HW does not seem to be RDMA hardware. Implementing a > undocumented UD-only, no kverbs, a propritary protocol and > libfabric-only is exactly what usnic did, and is exactly why it has > been viewed as a mistake. > > Using rdma has a hacky way to do shared process vfio is not > really appropriate. EFA is much more similar to the IB spec than to vfio, it enforces PD protection, uses remote QP numbers addressing and standard AH/MR registration, does not allow userspace to send arbitrary packets and packet header construction is done on the device. > >>> 2) Any change to common verbs must be supported by full public >>> documentation good enough to allow another implementation. So >>> most likely the IB_QPT_SRD has to go away (this is a NAK). Driver >>> specific QP types can be implemented via driver udata. >> >> We can use driver specific QP type for now, perhaps in the future we'll be able >> to expose more detailed documentation about SRD. > > Without kverbs or rdma-core it is irrelevant what you call it. > >>> 3) We need to see the userspace for new drivers. A RDMA driver that >>> doesn't provide a useful rdma-core provider is deeply suspect as >>> not crossing the #1 threshold. >> >> We'll be publishing our libfabric userspace provider soon, I'll add >> a link once it's available. > > libfabric does not do the checks on the ABI construction that > rdma-core does, so that still has to be done somehow > > Jason >