On Oct 21, 2014, at 3:45 AM, Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx> wrote:

> On Tue, Oct 21, 2014 at 4:06 AM, Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
>>
>> There is no show-stopper (see Section 5.1, after all). It’s
>> simply a matter of development effort: a side-car is much
>> less work than implementing full RDMA backchannel support for
>> both a client and server, especially since TCP backchannel
>> already works and can be used immediately.
>>
>> Also, no problem with eventually implementing RDMA backchannel
>> if the complexity, and any performance overhead it introduces in
>> the forward channel, can be justified. The client can use the
>> CREATE_SESSION flags to detect what a server supports.
>
> What complexity and performance overhead does it introduce in the
> forward channel?

The benefit of RDMA is that there are opportunities to reduce host
CPU interaction with incoming data. Bi-direction requires that the
transport look at the RPC header to determine the direction of the
message. That could have an impact on the forward channel, but it’s
never been measured, to my knowledge.

The reason this is more of an issue for RPC/RDMA is that a copy of
the XID appears in the RPC/RDMA header, to avoid the need to look
at the RPC header. That copy is typically what implementations use
to steer RPC reply processing.

Often the RPC/RDMA header and the RPC header land in disparate
buffers. The RPC/RDMA reply handler looks strictly at the RPC/RDMA
header, and runs in a tasklet, usually on a different CPU. Adding
bi-direction would mean the transport would have to peek into the
upper layer headers, possibly resulting in cache line bouncing. (A
toy sketch of the two receive paths follows below.)

The complexity would be the addition of over a hundred new lines of
code on the client, and possibly a similar amount of new code on
the server. Small, perhaps, but not insignificant.

>>> 2) Why do we instead have to solve the whole backchannel problem in
>>> the NFSv4.1 layer, and where is the discussion of the merits for and
>>> against that particular solution? As far as I can tell, it imposes at
>>> least 2 extra requirements:
>>> a) NFSv4.1 client+server must have support either for session
>>> trunking or for clientid trunking
>>
>> Very minimal trunking support. The only operation allowed on
>> the TCP side-car's forward channel is BIND_CONN_TO_SESSION.
>>
>> Bruce told me that associating multiple transports to a
>> clientid/session should not be an issue for his server (his
>> words were “if that doesn’t work, it’s a bug”).
>>
>> Would this restrictive form of trunking present a problem?
>>
>>> b) NFSv4.1 client must be able to set up a TCP connection to the
>>> server (that can be session/clientid trunked with the existing RDMA
>>> channel)
>>
>> Also very minimal changes. The changes are already done,
>> posted in v1 of this patch series.
>
> I'm not asking for details on the size of the changesets, but for a
> justification of the design itself.

The size of the changeset _is_ the justification. It’s a much less
invasive change to add a TCP side-car than it is to implement RDMA
backchannel on both server and client.

Most servers would require almost no change. Linux needs only a bug
fix or two. Effectively zero impact for servers that already support
NFSv4.0 on RDMA to get NFSv4.1 and pNFS on RDMA, with working
callbacks. That’s really all there is to it.

It’s almost entirely a practical consideration: we have the
infrastructure and can make it work in just a few lines of code.
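Since the steering point above is hard to picture from prose, here
is a compilable toy sketch of the two receive paths. To be clear
about what is invented: this is not the Linux code; the struct is a
simplification of the RFC 5666 transport header, and every function
and constant name below is made up for illustration.

#include <stdint.h>
#include <stdio.h>
#include <arpa/inet.h>          /* ntohl(), htonl() */

/* Simplified RPC-over-RDMA transport header (cf. RFC 5666). */
struct rpcrdma_hdr {
	uint32_t rm_xid;        /* copy of the RPC XID */
	uint32_t rm_vers;
	uint32_t rm_credit;
	uint32_t rm_type;       /* RDMA_MSG, RDMA_NOMSG, ... */
};

enum { RPC_CALL = 0, RPC_REPLY = 1 };   /* RPC msg_type, RFC 5531 */

/* Toy table of pending forward-channel requests, keyed by XID. */
#define MAX_PENDING 16
static uint32_t pending[MAX_PENDING];

static int lookup_rqst(uint32_t xid)
{
	for (int i = 0; i < MAX_PENDING; i++)
		if (pending[i] == xid)
			return i;
	return -1;
}

/* Today's reply handler: steers strictly by the RPC/RDMA header.
 * The buffer holding the RPC message is never touched here. */
static void reply_handler_today(const struct rpcrdma_hdr *rh)
{
	printf("reply matched pending slot %d\n",
	       lookup_rqst(ntohl(rh->rm_xid)));
}

/* Bi-directional receive: before steering, peek at word 2 of the
 * RPC header to learn the message's direction. That word may live
 * in a separate buffer last written by a different CPU than the
 * one running this handler -- the potential cache line bounce. */
static void receive_handler_bidir(const struct rpcrdma_hdr *rh,
				  const uint32_t *rpcmsg)
{
	if (ntohl(rpcmsg[1]) == RPC_CALL)
		printf("backchannel call, xid %u\n", ntohl(rpcmsg[0]));
	else
		reply_handler_today(rh);
}

int main(void)
{
	pending[0] = 7;

	struct rpcrdma_hdr rh = { .rm_xid = htonl(7) };
	uint32_t reply[2]   = { htonl(7),  htonl(RPC_REPLY) };
	uint32_t cb_call[2] = { htonl(42), htonl(RPC_CALL) };

	receive_handler_bidir(&rh, reply);     /* forward channel */
	receive_handler_bidir(&rh, cb_call);   /* backchannel */
	return 0;
}

Note that reply_handler_today() gets everything it needs from one
header; only the bi-directional variant has to read both.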
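And to make the CREATE_SESSION point quoted above concrete: the flag
name and value below come from RFC 5661, Section 18.36, but the
helper wrapped around them is a hypothetical sketch, not an existing
kernel function.

#include <stdbool.h>
#include <stdint.h>

/* From RFC 5661, Section 18.36. */
#define CREATE_SESSION4_FLAG_CONN_BACK_CHAN  0x00000002

/* csr_flags is the flags word the server returned in the
 * CREATE_SESSION reply sent over the RPC/RDMA connection. If the
 * server echoed CONN_BACK_CHAN, it has agreed to send callbacks on
 * that connection; if not, the client binds a TCP side-car with
 * BIND_CONN_TO_SESSION instead. */
static bool rdma_backchannel_granted(uint32_t csr_flags)
{
	return (csr_flags & CREATE_SESSION4_FLAG_CONN_BACK_CHAN) != 0;
}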
> If it is possible to confine all
> the changes to the RPC/RDMA layer, then why consider patches that
> change the NFSv4.1 layer at all?

Fast bring-up of new transports is probably the biggest win. A TCP
side-car makes bringing up any new transport implementation simpler.

And RPC/RDMA offers zero performance benefit for backchannel
traffic, especially since CB traffic would never move via RDMA
READ/WRITE (as per RFC 5667, Section 5.1; a toy illustration is in
the P.S. below).

The primary benefit of an RPC/RDMA-only solution is that there is
no upper layer impact. Is that a design requirement?

There’s also been no discussion of the issues that adding a very
restricted amount of transport trunking could introduce. Can you
elaborate on the problems you foresee?

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com
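P.S. To put a number on the RFC 5667 point above: RDMA READ/WRITE
pays off only for payloads too large to travel inline, and callback
messages never come close. The threshold and names here are
assumptions for illustration, not anyone's implementation.

#include <stdbool.h>
#include <stddef.h>

#define INLINE_THRESHOLD 1024   /* assumed typical inline size, bytes */

/* Chunking (RDMA READ/WRITE) is worthwhile only when the RPC
 * message exceeds the pre-posted inline buffer. */
static bool needs_rdma_chunks(size_t rpclen)
{
	return rpclen > INLINE_THRESHOLD;
}

/* A CB_RECALL is roughly a stateid (16 bytes), a filehandle (at
 * most 128 bytes), and a handful of header words -- a couple
 * hundred bytes at worst, so needs_rdma_chunks() is always false
 * for CB traffic and every callback goes as a plain inline SEND. */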