Re: questions about the linux NFS 4.1 client and persistent sessions

"J. Bruce Fields" <bfields@xxxxxxxxxxxx> · Sat, 17 Oct 2020 17:14:03 -0400

On Sat, Oct 17, 2020 at 11:40:09PM +0300, Guy Keren wrote:
> according to what you wrote here, an NFS4ERR_DELAY response is
> something that needs to be sent at the level of the entire compound
> request - i.e. the server is not allowed to send a compound response
> where the first few requests have a status of NFS4_OK, while the last
> have a status of NFS4ERR_DELAY.

Oh, no, it's absolutely fine for a server to do that.

Sorry, you mentioned persistent sessions, so I assumed somehow this was
about retries after crashes or reboots, where the client may not have
received the reply and doesn't know whether it executed.

> according to what you say, if the OPEN request is in the middle of the
> compound request, and is preceded by state-modifying requests (e.g.
> creation of other files, writes into other open handles, renames,
> etc.), then the server must avoid processing them until it recalled
> the delegation to the file (i.e. it must process the entire command to
> make sure it doesn't need to send an NFS4ERR_DELAY response due to any
> of the requests inside it, before it starts processing, and it must
> also lock the state of all files involved in the request, to avoid
> another client acquiring a delegation on any of the files in the
> request that have an OPEN request in the same compound). alternatively,
> it must not send an NFS4ERR_DELAY response, and instead just keep the
> request pending until the delegation recall was completed.

No, sorry for the confusion, you're correct, if the client had a bunch
of non-idempotent ops all in one compound, and got a DELAY partway
through, then, yes, it would have to deal with retrying only the part
that didn't execute.

I don't know of any client that actually does that, for what it's worth.
The Linux client, for example, doesn't send any compounds that I can
think of that have more than one non-idempotent op.
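
To make "retrying only the part that didn't execute" concrete, here is a
minimal sketch of what such a client would have to do. It relies only on
the fact that COMPOUND processing stops at the first failing op, so the
reply carries one status per op up to and including the one that failed.
Everything else is an assumption for illustration: split_on_delay, the
string op labels and the string status names are hypothetical and not
taken from any real client.

```python
NFS4_OK = "NFS4_OK"
NFS4ERR_DELAY = "NFS4ERR_DELAY"


def split_on_delay(ops, statuses):
    """Split a compound into (completed, to_retry) after a DELAY reply.

    ops      -- the ops that were sent, in order (hypothetical labels)
    statuses -- one status per op the server processed; since processing
                stops at the first error, only the last may be non-OK
    """
    if not statuses or len(statuses) > len(ops):
        raise ValueError("reply does not match the request")
    last = len(statuses) - 1
    if any(s != NFS4_OK for s in statuses[:last]):
        raise ValueError("only the final status may be an error")
    if statuses[last] == NFS4_OK:
        if len(statuses) != len(ops):
            raise ValueError("truncated reply without an error")
        return list(ops), []
    if statuses[last] != NFS4ERR_DELAY:
        raise ValueError("this sketch only handles NFS4ERR_DELAY")
    # Ops before the failing one were executed; the failing op and
    # everything after it were not.
    return list(ops[:last]), list(ops[last:])


ops = ["SEQUENCE", "PUTFH(dirA)", "REMOVE(x)",
       "PUTFH(dirB)", "OPEN(y)", "GETFH"]
statuses = [NFS4_OK, NFS4_OK, NFS4_OK, NFS4_OK, NFS4ERR_DELAY]

completed, to_retry = split_on_delay(ops, statuses)

# The current filehandle does not carry over to a new compound, so the
# retry needs a fresh SEQUENCE and the PUTFH that was in effect at the
# point of failure. Picking that prefix is the caller's job here.
retry_compound = ["SEQUENCE", "PUTFH(dirB)"] + to_retry
```

The hard part in a real client would be working out the right
filehandle prefix after ops that change the current filehandle, which
this sketch leaves to the caller.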

> i would assume that the same mechanism used to create the compound
> request in the first place (adding the PUTFH in front, etc.) could be
> used during a re-building of a smaller compound request - provided
> that the client knows which requests from the compound were already
> completed - and which were not.
> 
> but i understand that there's no such mechanism today on the linux NFS
> client kernel - which is what i initially asked - so that clarifies
> things.

Right, in theory you could imagine clients doing very general things
with compounds.  In practice I don't know of any that do.

(Not that that allows a spec-compliant server to assume they won't.)

> what about a situation in which instead of a server restart event, the
> client just disconnected before receiving a rename response, and
> re-connected with the same session to the same server? in that case,
> i presume that the Linux NFS client will re-send the compound request,
> and get the results from the server's Duplicate-Request cache, without
> returning errors to the application. correct?

Right, assuming the client managed to hang on to its lease.
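
Roughly, the per-slot replay logic behind that looks like the sketch
below. This is a simplified in-memory model, not the Linux server's
code: Slot, Session and handle are hypothetical names, seqid wraparound
is ignored, and the reply is whatever the supplied execute callback
returns. A persistent session would additionally have to keep the
cached replies on stable storage so they survive a server restart.

```python
class Slot:
    """One session slot: last seqid seen and the reply cached for it."""

    def __init__(self):
        self.seqid = 0            # no request seen yet; first valid is 1
        self.cached_reply = None


class Session:
    def __init__(self, nslots, execute):
        self.slots = [Slot() for _ in range(nslots)]
        self.execute = execute    # callback that actually runs a compound

    def handle(self, slotid, seqid, compound):
        slot = self.slots[slotid]
        if seqid == slot.seqid and slot.cached_reply is not None:
            # Retransmission: return the cached reply, do not re-execute.
            return slot.cached_reply
        if seqid == slot.seqid + 1:
            # New request on this slot: execute once and cache the reply.
            reply = self.execute(compound)
            slot.seqid = seqid
            slot.cached_reply = reply
            return reply
        return "NFS4ERR_SEQ_MISORDERED"


executed = []


def run(compound):
    executed.append(compound)
    return ("NFS4_OK", compound)


session = Session(nslots=1, execute=run)
first = session.handle(0, 1, "RENAME a -> b")
# Client lost the connection before seeing the reply and resends the
# same compound with the same slot and seqid.
replay = session.handle(0, 1, "RENAME a -> b")
```

Here the rename runs once and the resend gets the cached reply, so the
application never sees an error from the retransmission.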

> and this doesn't answer the original question: how was the "persistent
> sessions" support in the linux NFS 4.1 client tested?

I don't know, sorry.

> on an aside - i see that you are also the maintainer of the pynfs test
> suite. would you be interested in patches fixing its install
> operation, and if yes - should we send them to this mailing list, or
> directly to you? i failed to find a mailing list dedicated to pynfs
> development.

Just send them to me, cc'd to this list.  Thanks!

--b.