Re: [PATCH RFC 0/5] xprtrdma Send completion batching

> On Sep 6, 2017, at 11:23 AM, Sagi Grimberg <sagi@xxxxxxxxxxx> wrote:
> 
> 
>>> I see, but how can the user know that it needs to use RPCSEC_GSS
>>> otherwise nfs/rdma might compromise sensitive data? And is this
>>> a valid constraint? (just asking, you're the expert)
>> sec=krb5p is used in cases where data on the wire must remain
>> confidential. Otherwise, sensitive or no, data on the wire goes
>> in the clear.
>> But an administrator might not expect that other sensitive data
>> on the client (not involved with NFS) can be placed on the wire
>> by the vagaries of memory allocation and hardware retransmission,
>> as exceptionally rare as that might be.
>> Memory in which Send data resides is donated to the device until
>> the Send completion fires: the ULP has no way to get it back in
>> the meantime. ULPs can invalidate memory used for RDMA Read at
>> any time, but Send memory is registered with the local DMA key
>> (as anything else is just as expensive as an RDMA data transfer).
>>
>> The immediate solution is to never use Send to move file data
>> directly. It will always have to be either copied into a buffer
>> or moved via RDMA Read. These buffers contain only data that is
>> destined for the wire. Does that close the unwanted exposure
>> completely?
> 
> It would, but is that a smaller sacrifice than signaling
> send completions for small writes?

Recall that if there's no file data, the transport will
utilize a persistently registered and DMA mapped buffer
that it owns in which to build the RPC Call message and
post the Send.

If there is file data, the same buffer is used, but the
memory containing the file data is DMA mapped and added
to the Send SGE list.

With sendctx, every 16th Send [*] is signaled, whether or not
it carries extra SGEs that need to be unmapped.
All other Sends are not signaled. This guarantees correct
Send Queue accounting for all workloads and registration
modes, using a minimum number of completions.

During each Send completion, the handler walks through
SGEs since the last completion, and unmaps them if needed.

If we choose never to do scatter-gather Send with file
data, then this last step is unneeded because then only
persistently registered and mapped buffers would be used
for sending RPC Call messages.

But note that either mechanism results in the same Send
completion rate.


[*] 16 is adjusted down to accommodate smaller Send
Queues as needed.
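
To make the batching concrete, here is a rough user-space sketch
in C of the scheme described above. It is not the xprtrdma code:
struct sketch_sq, struct sketch_sendctx, sketch_post_send() and
sketch_send_done() are hypothetical names, the ring size is
arbitrary, the DMA unmap is reduced to a counter, and the
downward adjustment of the batch size for small Send Queues is
not modeled.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_RING_SIZE 64	/* hypothetical ring capacity */

struct sketch_sendctx {
	unsigned int nr_mapped;	/* extra SGEs still DMA mapped for this Send */
};

struct sketch_sq {
	struct sketch_sendctx ring[SKETCH_RING_SIZE];
	unsigned int head;		/* next slot to use when posting */
	unsigned int tail;		/* oldest slot not yet cleaned up */
	unsigned int batch;		/* signal every batch-th Send */
	unsigned int since_signal;	/* Sends posted since the last signaled one */
};

static void sketch_sq_init(struct sketch_sq *sq, unsigned int batch)
{
	assert(batch > 0 && batch < SKETCH_RING_SIZE);
	sq->head = 0;
	sq->tail = 0;
	sq->batch = batch;
	sq->since_signal = 0;
}

/*
 * Post one Send carrying nr_extra_sges mapped SGEs. Returns the ring
 * slot used, or -1 if the ring is full. *signaled is set when this
 * Send is the one that requests a completion.
 */
static int sketch_post_send(struct sketch_sq *sq, unsigned int nr_extra_sges,
			    bool *signaled)
{
	unsigned int slot = sq->head;
	unsigned int next = (slot + 1) % SKETCH_RING_SIZE;

	if (next == sq->tail)
		return -1;

	sq->ring[slot].nr_mapped = nr_extra_sges;
	sq->head = next;

	*signaled = (++sq->since_signal == sq->batch);
	if (*signaled)
		sq->since_signal = 0;
	return (int)slot;
}

/*
 * Completion handler for the signaled Send in 'slot': walk every Send
 * posted since the previous completion and release its mappings.
 * Returns the number of SGEs released.
 */
static unsigned int sketch_send_done(struct sketch_sq *sq, unsigned int slot)
{
	unsigned int end = (slot + 1) % SKETCH_RING_SIZE;
	unsigned int unmapped = 0;

	while (sq->tail != end) {
		struct sketch_sendctx *sc = &sq->ring[sq->tail];

		unmapped += sc->nr_mapped;	/* stand-in for the real DMA unmap */
		sc->nr_mapped = 0;
		sq->tail = (sq->tail + 1) % SKETCH_RING_SIZE;
	}
	return unmapped;
}
```

With a batch of 16, posting 32 Sends yields two completions, and
each completion cleans up the 16 Sends posted since the previous
one. If no Send ever carries extra SGEs, the walk finds nothing
to unmap, but the completion rate is unchanged.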

>> If the HCA can guarantee that all Sends complete quickly (either
>> successful, flush, or time out after a few seconds) then it could
>> be fair to make RPC completion also wait for Send completion.
>> Otherwise, a ^C on a file operation targeting an unreachable
>> server will hang indefinitely.
> 
> You could set retry_count=0/1 which will fail with zero or one
> send retries (a matter of seconds), but that would make the qp go to
> error state which is probably not what we want...

I'm told that not letting the hardware try as hard as it
can to transmit a Send is asking for data corruption. Thus
the current setting is 6. That should cause a time out in
less than a minute? It depends on the HCA I guess.

Dropping the connection is desirable in that it forces a full
reconnect (with address resolution) and kicks off another
Send. It is undesirable in that it also interrupts all other
outstanding RPCs on that connection.

As I see it, the options are to apply sendctx (this series),
and then:

A. Remove the post-v4.6 scatter-gather code, or

B. Force RPC completion to wait for Send completion, which
would allow the post-v4.6 scatter-gather code to work
safely. This would need some guarantee that Sends will
always complete in a short period.

For B, the signaling scheme would have to be: signal
non-data-bearing Send every so often, but signal all
data-bearing Sends. RPC completion would have to be able
to tell the difference and wait as needed. I can probably
handle this by adding a couple of atomic bits in struct
rpcrdma_req.
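
As a rough illustration of what B would involve, here is a
user-space sketch in C11. Nothing here exists in xprtrdma today:
struct sketch_req, the two flag bits, and all function names are
hypothetical, and resetting the batch counter on a data-bearing
Send is my assumption, not something settled above. Whichever of
the two events (Send completion, Reply arrival) happens last is
the one that completes the RPC.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_REQ_SEND_PENDING	0x1u	/* signaled data-bearing Send in flight */
#define SKETCH_REQ_REPLY_RCVD	0x2u	/* RPC Reply has arrived */

struct sketch_req {
	atomic_uint flags;
};

/*
 * Signal every data-bearing Send; signal a non-data-bearing Send
 * only every batch-th time. Any signaled Send resets the counter.
 */
static bool sketch_should_signal(bool data_bearing,
				 unsigned int *since_signal,
				 unsigned int batch)
{
	assert(batch > 0);
	if (data_bearing || ++*since_signal >= batch) {
		*since_signal = 0;
		return true;
	}
	return false;
}

/* Called before posting the Send for this request. */
static void sketch_req_prepare(struct sketch_req *req, bool data_bearing)
{
	atomic_store(&req->flags, data_bearing ? SKETCH_REQ_SEND_PENDING : 0u);
}

/*
 * Send completion for a data-bearing Send. Returns true if the
 * Reply already arrived, so the RPC should be completed now.
 */
static bool sketch_send_done(struct sketch_req *req)
{
	unsigned int prev = atomic_fetch_and(&req->flags,
					     ~SKETCH_REQ_SEND_PENDING);

	return (prev & SKETCH_REQ_REPLY_RCVD) != 0;
}

/*
 * Reply arrival. Returns true if no Send completion is outstanding,
 * so the RPC can be completed now; otherwise the Send completion
 * handler finishes the job later.
 */
static bool sketch_reply_received(struct sketch_req *req)
{
	unsigned int prev = atomic_fetch_or(&req->flags,
					    SKETCH_REQ_REPLY_RCVD);

	return (prev & SKETCH_REQ_SEND_PENDING) == 0;
}
```

For a request with no file data the pending bit is never set, so
the Reply handler completes the RPC immediately, as it does now.
The sketch does not address the harder part of B, which is
bounding how long a Send completion can take to arrive.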

Option A seems like the more straightforward approach.


--
Chuck Lever


