> On Nov 24, 2015, at 5:59 AM, Sagi Grimberg <sagig@xxxxxxxxxxxxxxxxxx> wrote:
>
> On 24/11/2015 08:45, Christoph Hellwig wrote:
>> On Mon, Nov 23, 2015 at 05:14:14PM -0500, Chuck Lever wrote:
>>> In the current xprtrdma implementation, some memreg strategies
>>> implement ro_unmap synchronously (the MR is knocked down before the
>>> method returns) and some asynchronously (the MR will be knocked down
>>> and returned to the pool in the background).
>>>
>>> To guarantee the MR is truly invalid before the RPC consumer is
>>> allowed to resume execution, we need an unmap method that is
>>> always synchronous, invoked from the RPC/RDMA reply handler.
>>>
>>> The new method unmaps all MRs for an RPC. The existing ro_unmap
>>> method unmaps only one MR at a time.
>>
>> Do we really want to go down that road? It seems like we've decided
>> in general that while the protocol specs say MR must be unmapped before
>> proceeding with the data that is painful enough to ignore this
>> requirement. E.g. iser for example only does the local invalidate
>> just before reusing the MR.

That leaves the MR exposed to the remote indefinitely. If
the MR is registered for remote write, that seems hazardous.

> It is painful, too painful. The entire value proposition of RDMA is
> low-latency and waiting for the extra HW round-trip for a local
> invalidation to complete is unacceptable, moreover it adds a huge load
> of extra interrupts and cache-line pollutions.

The killer is the extra context switches, I’ve found.

> As I see it, if we don't wait for local-invalidate to complete before
> unmap and IO completion (and no one does) then local invalidate before
> re-use is only marginally worse. For iSER, remote invalidate solves this
> (patches submitted!) and I'd say we should push for all the
> storage standards to include remote invalidate.

I agree: the right answer is to use remote invalidation,
and to ensure the order is always:

  1. invalidate the MR
  2. unmap the MR
  3. wake up the consumer

And that is exactly my strategy for NFS/RDMA. I don’t have
a choice: as Tom observed yesterday, krb5i is meaningless
unless the integrity of the data is guaranteed by fencing
the server before the client performs checksumming. I
expect the same is true for T10-PI.

> There is the question
> of multi-rkey transactions, which is why I stated in the past that
> arbitrary sg registration is important (which will be submitted soon
> for ConnectX-4).
>
> Waiting for local invalidate to complete would be a really big
> sacrifice for our storage ULPs.

I’ve noticed only a marginal loss of performance on modern
hardware.

--
Chuck Lever
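
For illustration only, the ordering above (invalidate, unmap, wake up the consumer) can be modelled in plain user-space C. This is a rough sketch, not xprtrdma code: every name in it (model_mr, model_rpc, model_invalidate, model_unmap, model_reply_handler) is hypothetical, and the "invalidate" step is just a state change standing in for a completed local or remote invalidation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
enum mr_state { MR_VALID, MR_INVALID, MR_UNMAPPED };

struct model_mr {
    enum mr_state state;
};

struct model_rpc {
    struct model_mr *mrs;   /* all MRs registered for this RPC */
    size_t nr_mrs;
    bool consumer_awake;    /* set once the consumer may touch the data */
};

/* Step 1: fence the MR so the remote peer can no longer access it.
 * Assumption: real code would wait for the invalidation to complete;
 * here it is modelled as an immediate state change. */
void model_invalidate(struct model_mr *mr)
{
    mr->state = MR_INVALID;
}

/* Step 2: release the mapping. Refuse if the MR has not been
 * invalidated first, since it would still be remotely accessible. */
int model_unmap(struct model_mr *mr)
{
    if (mr->state != MR_INVALID)
        return -1;
    mr->state = MR_UNMAPPED;
    return 0;
}

/* Synchronously knock down every MR for one RPC, and only then
 * let the consumer resume (step 3). */
int model_reply_handler(struct model_rpc *rpc)
{
    size_t i;

    assert(rpc != NULL);

    for (i = 0; i < rpc->nr_mrs; i++)
        model_invalidate(&rpc->mrs[i]);

    for (i = 0; i < rpc->nr_mrs; i++)
        if (model_unmap(&rpc->mrs[i]) != 0)
            return -1;

    rpc->consumer_awake = true;
    return 0;
}
```

The point of the model is that the consumer flag is set only after every MR for the RPC has gone through both earlier steps, which is the guarantee krb5i needs before checksumming.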