On Mon, Jul 10, 2017 at 04:51:20PM -0400, Chuck Lever wrote:
> 
> > On Jul 10, 2017, at 4:05 PM, Jason Gunthorpe <jgunthorpe@xxxxxxxxxxxxxxxxxxxx> wrote:
> > 
> > On Mon, Jul 10, 2017 at 03:03:18PM -0400, Chuck Lever wrote:
> > 
> >> One option is to somehow split the Send-related data structures from
> >> rpcrdma_req, and manage them independently. I've already done that for
> >> MRs: MR state is now located in rpcrdma_mw.
> > 
> > Yes, this is what I was implying. Track the SQE-related stuff
> > separately in memory allocated during SQ setup - MR, DMA maps, etc.
> > 
> > No need for an atomic/lock then, right? The required memory is bounded
> > since the inline send depth is bounded.
> 
> Perhaps I lack some imagination, but I don't see how I can manage
> these small objects without a serialized free list or circular
> array that would be accessed in the forward path and also in a
> Send completion handler.

I don't get it: DMA unmap can only ever happen in the Send completion
handler, it can never happen in the forward path. (This is the whole
point of this thread.)

Since you are not using Send completions today, you can just use the
wr_id to point to the pre-allocated memory containing the pages to
invalidate, and completely remove DMA unmap from the forward path.

Usually I work things out so that the meta-data array is a ring and
every SQE post consumes a meta-data entry. Occasionally I signal a
completion whose wr_id carries the latest ring index, and the
completion handler runs through all the accumulated meta-data and acts
on it (e.g. unmaps). This approach still allows batching completions.

Since ring entries are of bounded size, we just preallocate the largest
size at QP creation. In this case that is some multiple of the number
of inline send pages * the number of SQE entries.

> This seems like a lot of overhead to deal with a very uncommon
> case. I can reduce this overhead by signaling only Sends that
> need to unmap page cache pages, but still.

Yes, but it is not avoidable.

> As we previously discussed, xprtrdma does SQ accounting using RPC
> completion as the gate. Basically xprtrdma will send another RPC
> as soon as a previous one is terminated. If the Send WR is still
> running when the RPC terminates, I can potentially overrun the
> Send Queue.

Makes sense. The SQ accounting must be precise.

Jason
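
For illustration, a minimal sketch of the ring-of-metadata pattern
described above, written against the 2017-era kernel verbs API. All
names here (struct send_meta, struct sq_ring, ring_post_send,
ring_send_done, MAX_UNMAP_PAGES, SQ_SIGNAL_INTERVAL) are hypothetical
and not taken from xprtrdma; locking and precise SQ-depth accounting
are assumed to be handled by the caller.

/*
 * Hypothetical sketch only; none of these names come from xprtrdma.
 * One metadata entry exists per SQE, preallocated at QP creation, so
 * the forward path never allocates and never unmaps.
 */
#include <rdma/ib_verbs.h>

#define MAX_UNMAP_PAGES		8	/* assumed bound on inline send pages */
#define SQ_SIGNAL_INTERVAL	16	/* signal every Nth Send to batch completions */

/* DMA mappings recorded at post time, released at completion time. */
struct send_meta {
	unsigned int	nr_pages;
	u64		dma_addr[MAX_UNMAP_PAGES];
	unsigned int	dma_len[MAX_UNMAP_PAGES];
};

struct sq_ring {
	struct ib_device	*device;
	struct ib_qp		*qp;
	struct send_meta	*meta;	/* sq_depth entries, allocated at QP creation */
	u32			sq_depth;
	u64			prod;	/* Sends posted so far */
	u64			cons;	/* entries already unmapped */
};

/*
 * Forward path: every post consumes one ring entry and records the DMA
 * addresses there. No unmapping happens here. The caller must keep
 * prod - cons within sq_depth (the precise SQ accounting noted above).
 */
static int ring_post_send(struct sq_ring *ring, struct ib_send_wr *wr,
			  const struct send_meta *pages)
{
	struct ib_send_wr *bad_wr;
	u64 seq = ring->prod++;

	ring->meta[seq % ring->sq_depth] = *pages;
	wr->wr_id = seq;

	/* Only every Nth Send is signaled, so completions are batched. */
	if (seq % SQ_SIGNAL_INTERVAL == 0)
		wr->send_flags |= IB_SEND_SIGNALED;

	return ib_post_send(ring->qp, wr, &bad_wr);
}

/*
 * Completion path: a completed Send implies all earlier Sends on the
 * same send queue have completed too, so everything accumulated up to
 * wc->wr_id can be unmapped here, entirely outside the forward path.
 */
static void ring_send_done(struct sq_ring *ring, const struct ib_wc *wc)
{
	u64 last = wc->wr_id;

	while (ring->cons <= last) {
		struct send_meta *m = &ring->meta[ring->cons % ring->sq_depth];
		unsigned int i;

		for (i = 0; i < m->nr_pages; i++)
			ib_dma_unmap_page(ring->device, m->dma_addr[i],
					  m->dma_len[i], DMA_TO_DEVICE);
		m->nr_pages = 0;
		ring->cons++;
	}
}

The fixed signaling interval is only one possible policy; the variant
Chuck mentions, signaling only those Sends that actually carry page
cache pages, would record per-entry state the same way but set
IB_SEND_SIGNALED selectively instead of every Nth post.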