> On Jun 7, 2016, at 5:28 PM, Jason Gunthorpe <jgunthorpe@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Tue, Jun 07, 2016 at 05:09:47PM -0400, Chuck Lever wrote:
>
>>> Either the number in queue is limited by the invalidation
>>> synchronization point to 2, or in the case of iWARP read, it doesn't
>>> matter since everything executes serially in order and the value can
>>> safely wrap.
>>
>> I don't understand this. How does invalidation prevent
>> the ULP from registering more R_keys?
>
> The ULP doesn't 'register' rkeys, it only changes the active RKey assigned
> to a MR. The number of active rkeys is strictly limited to the number
> of available MRs.

I'm sure I'm using the language poorly.

> Further, changing the rkey can only be done as part of an invalidate
> cycle.

OK, I wasn't remembering something clearly, and I drew an incorrect
conclusion. Yes, there always has to be an invalidation before an R_key
can be reused to register a new MR.

> Invalidating a MR/RKey can only happen after synchronization with the
> remote.
>
> So the work queue looks broadly like this:
>
> WR setup rkey 1
> WR send rkey 1 to remote
> <synchronize>
> WR invalidate rkey 1
> WR setup rkey 2
> WR send rkey 2 to remote
> <synchronize>
>
> Thus the WR queue can never have more than 2 rkeys per MR in it at
> any time, and there is no need to care about the 24/8 bit split.

Agreed, I see that the split has nothing to do with it. I will drop
this patch.

>> And, I'd like to limit the number of pre-allocated MRs
>> to no more than each transport endpoint can use. For
>> xprtrdma, a transport is one QP and PD. What is that
>> number?
>
> I'm not sure I understand this question. AFAIK there is no limit on
> MRs that can be used with a QP. As long as the MR is allocated you can
> use it.

There is a practical limit on the number of MRs that can be allocated
per device, I thought. And each MR consumes some amount of host memory.
xprtrdma is happy to allocate thousands of MRs per QP/PD pair, but that
isn't practical as you start adding more transports/connections. The
question has to do with scaling the number of xprt connections over
available device resources.

> Typically you'd allocate enough MRs to handle your typical-case
> concurrency level (pipeline depth).
>
> The concurrency level is limited by lots of things; for instance, the
> max number of posted recvs typically places a hard upper limit.

xprtrdma already makes an estimate. I'm wondering if it's still a valid
one. Fewer MRs means better scaling in the number of transports.

--
Chuck Lever