Re: [PATCH v2 2/4] NFSD: Add READ_PLUS support for data segments

On Feb 6, 2015, at 3:28 PM, J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote:

> On Fri, Feb 06, 2015 at 03:07:08PM -0500, Chuck Lever wrote:
>> 
>> On Feb 6, 2015, at 2:35 PM, J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
>>>> 
>>>> Small replies are sent inline. There is a size maximum for inline
>>>> messages, however. I guess 5667 section 5 assumes this context, which
>>>> appears throughout RFC 5666.
>>>> 
>>>> If an expected reply exceeds the inline size, then a client will
>>>> set up a reply list for the server. A memory region on the client is
>>>> registered as a target for RDMA WRITE operations, and the co-ordinates
>>>> of that region are sent to the server in the RPC call.
>>>> 
>>>> If the server finds the reply will indeed be larger than the inline
>>>> maximum, it plants the reply in the client memory region described by
>>>> the request’s reply list, and repeats the co-ordinates of that region
>>>> back to the client in the RPC reply.
>>>> 
>>>> A server may also choose to send a small reply inline, even if the
>>>> client provided a reply list. In that case, the server does not
>>>> repeat the reply list in the reply, and the full reply appears
>>>> inline.
>>>> 
>>>> Linux registers part of the RPC reply buffer for the reply list. After
>>>> it is received on the client, the reply payload is copied by the client
>>>> CPU to its final destination.
>>>> 
>>>> Inline and reply list are the mechanisms used when the upper layer
>>>> has to process the incoming data before using it (e.g., READDIR).
>>>> When a request just needs raw data to be dropped off in the client’s
>>>> memory, the write list is preferred. A write list is basically
>>>> zero-copy I/O.
>>> 
>>> The term "reply list" doesn't appear in either RFC.  I believe you mean
>>> "client-posted write list" in most of the above, except this last
>>> paragraph, which should have started with "Inline and server-posted
>>> read list..."  ?
>> 
>> No, I meant “reply list.” Definitely not read list.
>> 
>> The terms used in the RFCs and the implementations vary,
> 
> OK.  Would you mind defining the term "reply list" for me?  Google's not helping.

Let’s look at section 4.3 of RFC 5666. Each RPC/RDMA header begins
with this:
 
      struct rdma_msg {
         uint32    rdma_xid;     /* Mirrors the RPC header xid */
         uint32    rdma_vers;    /* Version of this protocol */
         uint32    rdma_credit;  /* Buffers requested/granted */
         rdma_body rdma_body;
      };

rdma_body starts with a uint32 which discriminates a union:

      union rdma_body switch (rdma_proc proc) {
. . .
         case RDMA_NOMSG:
           rpc_rdma_header_nomsg rdma_nomsg;
. . .
      };

When “proc” == RDMA_NOMSG, rdma_body is made up of three lists:

      struct rpc_rdma_header_nomsg {
         struct xdr_read_list   *rdma_reads;
         struct xdr_write_list  *rdma_writes;
         struct xdr_write_chunk *rdma_reply;
      };

The “reply list” is that last part: rdma_reply, which is a counted
array of xdr_rdma_segment structures.
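
To make the wire layout concrete, here is a minimal sketch of decoding
that optional reply chunk from a raw buffer. It assumes the RFC 5666
encoding: a uint32 "present" flag, a uint32 segment count, then one
(handle, length, offset) triple per segment. The names rdma_segment,
xdr_u32 and decode_reply_chunk are made up for illustration; this is
not the Linux implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors xdr_rdma_segment from RFC 5666 section 4.3. */
struct rdma_segment {
    uint32_t handle;   /* handle of the registered memory region */
    uint32_t length;   /* length of the region in bytes */
    uint64_t offset;   /* starting offset (address) of the region */
};

/* Read one big-endian 32-bit XDR word. */
static uint32_t xdr_u32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Hypothetical helper: decode the optional reply chunk found at buf.
 * Returns the number of segments decoded (0 when no reply chunk is
 * present), or -1 if the buffer is too short or holds more than
 * max_segs segments.
 */
static int decode_reply_chunk(const uint8_t *buf, size_t len,
                              struct rdma_segment *segs, size_t max_segs)
{
    size_t pos = 0;
    uint32_t present, count, i;

    assert(buf != NULL);

    if (len < 4)
        return -1;
    present = xdr_u32(buf);          /* optional-data discriminator */
    pos += 4;
    if (!present)
        return 0;

    if (len - pos < 4)
        return -1;
    count = xdr_u32(buf + pos);      /* counted array: length first */
    pos += 4;
    if (count > max_segs || (len - pos) / 16 < count)
        return -1;

    for (i = 0; i < count; i++) {
        segs[i].handle = xdr_u32(buf + pos);
        segs[i].length = xdr_u32(buf + pos + 4);
        segs[i].offset = ((uint64_t)xdr_u32(buf + pos + 8) << 32) |
                         xdr_u32(buf + pos + 12);
        pos += 16;                   /* each segment is 16 bytes on the wire */
    }
    return (int)count;
}
```

A return value of 0 corresponds to a call in which the client provided
no reply list at all.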

Large replies to operations other than NFS READ are sent using
RDMA_NOMSG. The RPC/RDMA header is sent as the inline portion of the
message. The RPC reply message (the part we are all familiar with) is
not inline; it is planted in the memory region described by rdma_reply.

rdma_reply is a write chunk. The server WRITEs its RPC reply into the
memory region described by rdma_reply. That description was provided
by the client in the matching RPC call message.
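
The server-side choice described in this thread can be sketched
roughly as follows. This is only an illustration: the names
reply_path and choose_reply_path are hypothetical, inline_max stands
for whatever inline threshold the two peers are using, and returning
an error when no adequate reply chunk was provided is my assumption,
not something the RFC text quoted here spells out.

```c
#include <stddef.h>

enum reply_path {
    REPLY_INLINE,   /* reply travels inline with the RPC/RDMA header */
    REPLY_CHUNK,    /* RDMA_NOMSG: reply is RDMA WRITTEN via rdma_reply */
    REPLY_ERROR     /* too big for inline, and no usable reply chunk */
};

/*
 * Hypothetical helper: pick how to return an RPC reply of reply_len
 * bytes. reply_chunk_len is the total size of the region the client
 * described in rdma_reply, or 0 if the call carried no reply chunk.
 */
static enum reply_path choose_reply_path(size_t reply_len,
                                         size_t inline_max,
                                         size_t reply_chunk_len)
{
    /* A small reply may go inline even if a reply chunk was offered. */
    if (reply_len <= inline_max)
        return REPLY_INLINE;

    /* A large reply needs a client-provided region big enough for it. */
    if (reply_chunk_len >= reply_len)
        return REPLY_CHUNK;

    return REPLY_ERROR;
}
```

In the REPLY_CHUNK case the server would also repeat the reply chunk
in its RPC/RDMA header, so the client knows where the reply landed.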

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com






