On Wed, Jul 6, 2022 at 7:24 PM Jason Gunthorpe <jgg@xxxxxxxx> wrote:
>
> On Wed, Jul 06, 2022 at 11:59:14AM +0300, Oded Gabbay wrote:
>
> > Due to that, we would want to put all the ports under a single struct ib_device,
> > as you said it yourself in your original email a year ago.
>
> Yes
>
> > The major constraints are:
> >
> > 1. Support only RDMA WRITE operation. We do not support READ, SEND or RECV.
> >    This means that many existing open source tests in rdma-core are not
> >    compatible. e.g. rc_pingpong.c will not work. I guess we will need to
> >    implement different tests and submit them ? Do you have a
> >    different idea/suggestion ?
>
> I would suggest following what EFA did and just using your own unique
> QP with dv accessors to create it. A QP that can only do RDMA WRITE is
> not IBA compliant and shouldn't be created by a standard verbs call.
>
> > 2. As you mentioned in the original email, we support only a single PD.
> >    I don't see any major implication regarding this constraint but please
> >    correct me if you think otherwise.
>
> Seems fine
>
> > 3. MR limitation on the rkey that is received from the remote connection
> >    during connection creation. The limitation is that our h/w extracts
> >    the rkey from the QP h/w context and not from the WQE when sending packets.
> >    This means that we may associate only a single remote MR per QP.
>
> It seems OK in the context above where you have your own QP type and
> obviously your special RDMA WRITE poster will not take in an rkey as
> any argument.
>
> > Do you see any issue here with these two limitations ? One thing we noted is
> > that we need to somehow configure the rkey in our h/w QP context, while today
> > the API doesn't allow it.
>
> When you add your own dv qp create function it will take in the
> required rkey during qp creation.
>
> > These limitations are not relevant to a deployment where all the NICs are
> > Gaudi NICs, because we can use a single rkey for all MRs.
>
> Er, that is weird, did you mean to say you have only one MR per PD and
> that it always has a fixed value?

Not exactly. We have multiple MRs per PD, but the driver assigns the same
rkey (fixed value) for all created MRs. Our h/w matches the rkey with the one
that is written in the QP. The rkey is not part of the actual MMU translation
that is done inside our h/w. The MMU translation is done using the
PD (we call it ASID - address space ID) and Address.

>
> > 4. We do not support all the flags in the reg_mr API. e.g. we don't
> >    support IBV_ACCESS_LOCAL_WRITE. I'm not sure what the
> >    implication is here.
>
> It is OK, since you can't issue a local operation WQE anyhow you can
> just ignore the flag.
>
> > 5. Our h/w contains several accelerations we would like to utilize.
> >    e.g. we have a h/w mechanism for accelerating collective operations
> >    on multiple RDMA NICs. These accelerations will require either extensions
> >    to current APIs, or some dedicated APIs. For example, one of the
> >    accelerations requires that the user will create a QP with the same
> >    index on all the Gaudi NICs.
>
> Use your DV interface to do these kinds of things

Great! We will start to move forward using this approach.
I imagine we will have something to show in a couple of months.

Thanks,
Oded

>
> Thanks,
> Jason
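
For illustration only, the rkey handling described in the reply above (the
rkey is compared against the value stored in the QP, while the MMU translation
uses only the ASID and the address) can be modelled roughly as below. This is
a conceptual sketch, not driver, firmware or rdma-core code. Every name in it
(model_qp, model_mapping, model_rdma_write_check, MODEL_FIXED_RKEY) and the
rkey value are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed rkey that the driver would assign to every MR. */
#define MODEL_FIXED_RKEY 0x1234u

/* One MMU mapping, keyed by ASID (the PD) and virtual address range. */
struct model_mapping {
    uint32_t asid; /* address space ID, i.e. the PD */
    uint64_t va;   /* start of the mapped virtual range */
    uint64_t len;  /* length of the range in bytes */
    uint64_t pa;   /* physical address backing the start of the range */
};

/* Per-QP context: the rkey is programmed once, at QP creation time. */
struct model_qp {
    uint32_t asid;
    uint32_t rkey;
};

/*
 * Model of the check applied to an incoming RDMA WRITE.
 * Returns true and stores the translated address in *pa on success.
 */
bool model_rdma_write_check(const struct model_qp *qp,
                            const struct model_mapping *maps, size_t nmaps,
                            uint32_t pkt_rkey, uint64_t va, uint64_t *pa)
{
    assert(qp != NULL && pa != NULL);

    /* Step 1: the rkey is only compared with the value held in the QP. */
    if (pkt_rkey != qp->rkey)
        return false;

    /* Step 2: translation uses ASID + address; the rkey plays no part. */
    for (size_t i = 0; i < nmaps; i++) {
        if (maps[i].asid == qp->asid &&
            va >= maps[i].va && va - maps[i].va < maps[i].len) {
            *pa = maps[i].pa + (va - maps[i].va);
            return true;
        }
    }
    return false;
}
```

In this model a write is rejected either because the rkey does not match the
one held in the QP, or because the (ASID, address) pair has no mapping. That
is why a single fixed rkey can serve all MRs in an all-Gaudi deployment.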