Re: [PATCH 0/9] Multiple network connections for a single NFS mount.

Chuck Lever <chuck.lever@xxxxxxxxxx> · Thu, 13 Jun 2019 12:27:12 -0400

> On Jun 12, 2019, at 7:37 PM, NeilBrown <neilb@xxxxxxxx> wrote:
> 
> On Wed, Jun 12 2019, Chuck Lever wrote:
> 
>> Hi Neil-
>> 
>> 
>>> On Jun 11, 2019, at 9:49 PM, NeilBrown <neilb@xxxxxxxx> wrote:
>>> 
>>> This enables NFS to benefit from transparent parallelism in the network
>>> stack, such as interface bonding and receive-side scaling as described
>>> earlier.
>>> 
>>> When multiple connections are available, NFS will send
>>> session-management requests on a single connection (the first
>>> connection opened)
>> 
>> Maybe you meant "lease management" requests?
> 
> Probably I do .... though maybe I can be forgiven for mistakenly
> thinking that CREATE_SESSION and DESTROY_SESSION could be described as
> "session management" :-)
> 
>> 
>> EXCHANGE_ID, RECLAIM_COMPLETE, CREATE_SESSION, DESTROY_SESSION
>> and DESTROY_CLIENTID will of course go over the main connection.
>> However, each connection will need to use BIND_CONN_TO_SESSION
>> to join an existing session. That's how the server knows
>> the additional connections are from a client instance it has
>> already recognized.
>> 
>> For NFSv4.0, SETCLIENTID, SETCLIENTID_CONFIRM, and RENEW
>> would go over the main connection (and those have nothing to do
>> with sessions).
> 
> Well.... they have nothing to do with NFSv4.1 Sessions.
> But it is useful to have a name for the collection of RPCs related to a
> particular negotiated clientid, and "session" (small 's') seems as good
> a name as any....

Lease management is the proper terminology, as it covers NFSv4.0
as well as NFSv4.1 and has been used for years to describe this
set of NFS operations. Overloading the word "session" is just going
to confuse things.
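
To make the grouping concrete, here is a minimal Python sketch of how a
client might steer operations once several connections exist. The
operation names are the ones listed above. The ConnectionPicker class,
its pick() method, and the round-robin policy for everything else are
assumptions made for illustration, not a description of the patch set.

```python
# Lease-management operations named in this thread (NFSv4.0 and NFSv4.1).
LEASE_MANAGEMENT_OPS = frozenset({
    "EXCHANGE_ID", "RECLAIM_COMPLETE", "CREATE_SESSION",
    "DESTROY_SESSION", "DESTROY_CLIENTID",          # NFSv4.1
    "SETCLIENTID", "SETCLIENTID_CONFIRM", "RENEW",  # NFSv4.0
})


class ConnectionPicker:
    """Hypothetical helper: choose a connection index for each operation."""

    def __init__(self, nconnect: int) -> None:
        if nconnect < 1:
            raise ValueError("need at least one connection")
        self.nconnect = nconnect
        self._next = 0  # next connection for ordinary requests

    def pick(self, op: str) -> int:
        # Lease management always uses the main (first) connection.
        if op in LEASE_MANAGEMENT_OPS:
            return 0
        # Assumption: all other requests rotate over every connection.
        idx = self._next
        self._next = (self._next + 1) % self.nconnect
        return idx
```

With three connections, RENEW and EXCHANGE_ID always land on index 0,
while successive READ or WRITE requests cycle through 0, 1, 2.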

>> 3. RPC/RDMA clients always drop the connection before retransmitting
>> because they have to reset the connection's credit accounting.
>> 
>> 4. RPC/RDMA cannot depend on IP source port, because the RPC part
>> of the stack has no visibility into which source port is chosen.
>> Thus the server's DRC cannot use the source port. I think server
>> DRCs need to be prepared to deal with multiple client
>> connections.
> 
> OK, that could be an issue.

It isn't. The Linux NFS server computes a hash over the first ~200
bytes of each RPC call (thanks to Jeff Layton). We can safely ignore
the client IP source port and rely solely on that hash to tell the
requests apart.
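
As a rough illustration of the idea, and not the kernel code, the
following Python sketch keys a duplicate reply cache on a checksum of
the leading bytes of the call and never looks at the source port. The
200-byte cutoff mirrors the "~200 bytes" figure above. The use of
zlib.crc32 as the checksum, the pairing of the checksum with the xid,
and every class and function name here are assumptions.

```python
import zlib
from typing import Dict, NamedTuple, Optional

CSUM_LEN = 200  # assumption: stands in for "the first ~200 bytes"


class DrcKey(NamedTuple):
    xid: int
    csum: int  # checksum over the head of the RPC call


def drc_key(xid: int, rpc_call: bytes) -> DrcKey:
    # No source port here: only the xid and the call contents matter.
    return DrcKey(xid, zlib.crc32(rpc_call[:CSUM_LEN]))


class DuplicateReplyCache:
    """Hypothetical DRC that ignores which connection a call arrived on."""

    def __init__(self) -> None:
        self._replies: Dict[DrcKey, bytes] = {}

    def insert(self, xid: int, rpc_call: bytes, reply: bytes) -> None:
        self._replies[drc_key(xid, rpc_call)] = reply

    def lookup(self, xid: int, rpc_call: bytes) -> Optional[bytes]:
        # A retransmission hits; a different call with the same xid misses.
        return self._replies.get(drc_key(xid, rpc_call))
```

Two different calls that happen to share an xid on separate connections
get different keys, while a retransmission of the same call finds the
cached reply.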

My overall point is this descriptive text should ignore consideration
of IP source port in favor of describing the creation of multiple
flows of requests.

> Linux uses an independent xid sequence for each xprt, so two separate
> xprts can easily use the same xid for different requests.
> If RDMA cannot see the source port, it might depend more on the xid and
> so risk getting confused.
> 
> There was a patch floating around which reserved a few bits of the xid
> for an xprt index to ensure all xids were unique, but Trond didn't like
> sub-dividing the xid space (which is fair enough).
> So maybe it isn't safe to use nconnect with RDMA and protocol versions
> earlier than 4.1.
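
For reference, the xid concern quoted above can be sketched in a few
lines of Python. Each transport keeps its own counter, so two
transports can hand out the same xid. Reserving some high bits for a
transport index, as the patch mentioned above did, removes the overlap
at the cost of a smaller xid space. The Xprt class, its method names,
and the choice of 4 index bits are hypothetical.

```python
XID_BITS = 32
XPRT_INDEX_BITS = 4  # hypothetical: the thread only says "a few bits"


class Xprt:
    """Hypothetical transport with its own independent xid sequence."""

    def __init__(self, index: int, first_xid: int) -> None:
        if not 0 <= index < (1 << XPRT_INDEX_BITS):
            raise ValueError("xprt index does not fit in the reserved bits")
        self.index = index
        self._next_xid = first_xid % (1 << XID_BITS)

    def alloc_xid(self) -> int:
        # Plain per-transport sequence: nothing ties it to other xprts.
        xid = self._next_xid
        self._next_xid = (self._next_xid + 1) % (1 << XID_BITS)
        return xid

    def alloc_tagged_xid(self) -> int:
        # Reserve the top bits for the xprt index so xids never overlap.
        low_bits = XID_BITS - XPRT_INDEX_BITS
        seq = self.alloc_xid() & ((1 << low_bits) - 1)
        return (self.index << low_bits) | seq
```

Two transports that start at the same value return identical xids from
alloc_xid(), but distinct ones from alloc_tagged_xid().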

--
Chuck Lever