On Fri, Jul 8, 2022 at 4:29 PM Jason Gunthorpe <jgg@xxxxxxxx> wrote:
>
> On Thu, Jul 07, 2022 at 12:30:03PM +0300, Oded Gabbay wrote:
>
> > > > These limitations are not relevant to a deployment where all the NICs are
> > > > Gaudi NICs, because we can use a single rkey for all MRs.
> > >
> > > Er, that is weird, did you mean to say you have only one MR per PD and
> > > that it always has a fixed value?
> >
> > Not exactly. We have multiple MRs per PD, but the driver assigns the
> > same rkey (fixed value) for all created MRs. Our h/w matches the rkey
> > with the one that is written in the QP. The rkey is not part of the actual
> > MMU translation that is done inside our h/w. The MMU translation is
> > done using the PD (we call it ASID - address space ID) and Address.
>
> I don't understand this at all - how can you have multiple MRs if
> there is only one ASID per PD? The MR is logically the ASID since the
> MR is the verbs model for MMU translation.

We don't follow the MR verbs model. This is the meaning of the hardware
constraint I wrote about, imo.
Our MMU does a pgt walk that starts with the ASID and then just goes
according to the virtual address, the same as a regular CPU does. The
key is not part of the pgt.
The ASID represents different processes, but because we decided long ago
that we support only a single user process, we only allocate a single
ASID, which translates to a single PD in our IBverbs driver.

>
> So, if you have one ASID per PD and multiple MRs, what are the MRs
> supposed to be?
>
> Jason

Per my understanding, the MRs are meant to notify the driver that the
user would like the h/w MMU to be familiar with these memory regions.
As we also need to pin them, it is preferable to have multiple small MRs
rather than a single very large MR.
The fact that the key that is returned is the same for all memory
regions shouldn't affect the user. Our MMU will be able to do the
translation correctly using only the ASID+address.

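To make the flow above a bit more concrete, here is a rough toy model in
C. It is only a sketch and not our driver or h/w logic: every name in it
(toy_asid, toy_qp, toy_reg_mr, toy_translate, TOY_FIXED_RKEY) is made up
for illustration, and a flat lookup table stands in for the real pgt
walk. It assumes page-aligned addresses. What it tries to show is that
registering an MR only adds mappings to the single ASID and always hands
back the same key, and that the key is only compared with the one
written in the QP while the translation itself uses ASID+address.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12
#define TOY_PAGE_SIZE  (1ull << TOY_PAGE_SHIFT)
#define TOY_MAX_PAGES  16
#define TOY_FIXED_RKEY 0x1234u /* hypothetical value, same for every MR */

/* One address space (ASID); stands in for the single PD. */
struct toy_asid {
	uint64_t va_page[TOY_MAX_PAGES]; /* mapped virtual page numbers */
	uint64_t pa_page[TOY_MAX_PAGES]; /* matching physical page numbers */
	size_t nr_pages;
};

struct toy_qp {
	uint32_t rkey;         /* key written into the QP */
	struct toy_asid *asid; /* address space used for translation */
};

/*
 * "Register" an MR: record the page mappings in the ASID (pinning is
 * not modelled) and return the fixed key. Mappings beyond the table
 * size are silently dropped in this toy.
 */
static uint32_t toy_reg_mr(struct toy_asid *as, uint64_t va, uint64_t pa,
			   uint64_t len)
{
	for (uint64_t off = 0; off < len && as->nr_pages < TOY_MAX_PAGES;
	     off += TOY_PAGE_SIZE) {
		as->va_page[as->nr_pages] = (va + off) >> TOY_PAGE_SHIFT;
		as->pa_page[as->nr_pages] = (pa + off) >> TOY_PAGE_SHIFT;
		as->nr_pages++;
	}
	return TOY_FIXED_RKEY;
}

/*
 * Access check plus translation. The rkey is only matched against the
 * QP; the lookup itself depends on the ASID and the virtual address.
 */
static bool toy_translate(const struct toy_qp *qp, uint32_t rkey,
			  uint64_t va, uint64_t *pa)
{
	if (rkey != qp->rkey)
		return false;

	uint64_t vpn = va >> TOY_PAGE_SHIFT;

	for (size_t i = 0; i < qp->asid->nr_pages; i++) {
		if (qp->asid->va_page[i] == vpn) {
			*pa = (qp->asid->pa_page[i] << TOY_PAGE_SHIFT) |
			      (va & (TOY_PAGE_SIZE - 1));
			return true;
		}
	}
	return false; /* address is not mapped in this ASID */
}
```

In this model, calling toy_reg_mr() twice on the same toy_asid (say,
once for a host buffer and once for a device buffer) returns the same
key both times, and toy_translate() resolves addresses from either
region through the one ASID.
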
In addition, because we also have on-device memory (HBM), we would like
to allow the user to register memory regions in that memory. So we need
to support at least two MRs.

Thanks,
Oded