Re: Question on RPC_TASK_NO_RETRANS_TIMEOUT / NFS_CS_NO_RETRANS_TIMEOUT for NFSv3


On 8/25/23 2:49 PM, Trond Myklebust wrote:

NFSv3 servers are allowed to drop requests, and NFSv3 clients are
expected to retransmit them when this happens. NFSv4 servers may not
drop requests, and NFSv4 clients are expected never to retransmit
(unless the connection breaks). For that reason we do set
RPC_TASK_NO_RETRANS_TIMEOUT on NFSv4 and do not on NFSv3.

We have been doing a bunch of debugging on this issue, and the key
problem we are running into is that, because this is a Kerberos-enabled
mount, a client retransmit ends up generating a new MIC header /
checksum, since the krb5 context sequence number has moved on.

If that retransmit happens before the original response is received,
then MIC verification fails, since the client is now expecting a
response to the second packet and not the first. The failed MIC header
verification results in an EACCES error, which ends up as an I/O error
at the application.
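
As a rough illustration of the failure described above, here is a toy
model in Python. It is not the kernel code: the key, the HMAC-based
mic() helper and the ToyClient class are all made up for the example,
and the only behavior taken from this thread is that each
(re)transmission moves the sequence number on and the client checks
the reply against the latest one.

```python
import hashlib
import hmac

KEY = b"hypothetical-session-key"  # stand-in for the GSS context key


def mic(seqno: int) -> bytes:
    """Toy MIC: an HMAC over the sequence number (not the real krb5 MIC)."""
    return hmac.new(KEY, seqno.to_bytes(4, "big"), hashlib.sha256).digest()


def server_reply(seqno: int) -> bytes:
    """The toy server answers with a MIC over the seqno it actually saw."""
    return mic(seqno)


class ToyClient:
    def __init__(self) -> None:
        self.next_seqno = 1
        self.expected_seqno = 0

    def transmit(self) -> int:
        """Send or resend the request; every transmission takes a new seqno."""
        seqno = self.next_seqno
        self.next_seqno += 1
        self.expected_seqno = seqno  # only the latest transmission is expected
        return seqno

    def verify_reply(self, reply_mic: bytes) -> bool:
        """Check the reply against the seqno of the latest transmission."""
        return hmac.compare_digest(reply_mic, mic(self.expected_seqno))


client = ToyClient()
first = client.transmit()
delayed_reply = server_reply(first)   # held up on the wire
second = client.transmit()            # retransmit before the reply arrives

late_ok = client.verify_reply(delayed_reply)            # reply to 1st request
fresh_ok = client.verify_reply(server_reply(second))    # reply to 2nd request
print(late_ok, fresh_ok)
```

In this model the delayed reply to the first transmission fails
verification, while a reply to the retransmit passes.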

What we have found is that it is easy to repro in our environment by
adding an iptables rule that drops responses from the NFS server for
55-63 seconds.
With less than 55 seconds, the retransmit does not happen and things
recover.
With more than 63 seconds, the RPC code goes down the reconnect path
before doing the retransmit, and things recover.
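
To make the three cases explicit, a small sketch in Python. The 55 and
63 second thresholds are just what we observed in our environment, not
constants from the RPC code, and the function and constant names are
made up for the example.

```python
# Thresholds observed in our environment (assumptions, not kernel constants).
RETRANS_AFTER_S = 55    # below this, no retransmit was seen
RECONNECT_AFTER_S = 63  # above this, the client reconnected first


def outcome(drop_seconds: float) -> str:
    """Classify what we observed for a given response-drop duration."""
    if drop_seconds < RETRANS_AFTER_S:
        return "recovers: no retransmit"
    if drop_seconds <= RECONNECT_AFTER_S:
        return "fails: retransmit on the same connection"
    return "recovers: reconnect before retransmit"


for seconds in (30, 60, 90):
    print(seconds, outcome(seconds))
```

Only the middle window hits the problem, because that is the only case
where a retransmit goes out on the existing connection while the
original reply can still arrive.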

It seems like Kerberos-enabled mounts should be using
RPC_TASK_NO_RETRANS_TIMEOUT, since doing a retransmit changes the GSS
checksum from the original checksum.


No, that is not an option. NFSv3 servers are allowed to drop any
incoming RPC request without needing a reason, so turning on
RPC_TASK_NO_RETRANS_TIMEOUT would just lead to client hangs.

I can see that for UDP, but is that true for TCP as well?
Wouldn't the RPC code behave the same as v4 and set up a new connection
before doing the retransmit?
At least in our experimentation, if we leave the connection down for
more than 63 seconds, we can see from the RPC traces that this is what
happens. Once there is a new connection, the old message is ignored and
processing continues with the new set of requests / responses.


The right thing to do is to just fix up rpc_decode_header() to retry
instead of firing off an error in this case.

So you are thinking that rpc_decode_header() just returns EAGAIN if the
checksum fails? What happens if the GSS context actually goes bad
(times out, etc.)? Wouldn't that also result in the client getting
stuck doing resends over and over?
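
To spell out the concern, a toy sketch in Python. This is not how
rpc_decode_header() works and not a proposed patch; send_with_retry(),
the verify callbacks and the retry cap are all hypothetical. It only
shows why retrying on a bad checksum needs some bound if the context
itself can go bad.

```python
from typing import Callable, Tuple


def send_with_retry(verify_reply: Callable[[int], bool],
                    max_retries: int) -> Tuple[str, int]:
    """Retry while verification fails, but give up after max_retries.

    verify_reply(attempt) is a hypothetical callback that returns True
    when the reply to the given attempt passes checksum verification.
    """
    for attempt in range(1, max_retries + 1):
        if verify_reply(attempt):
            return ("ok", attempt)
    return ("error", max_retries)


def stale_then_good(attempt: int) -> bool:
    """First reply is stale (fails); the reply to the resend verifies."""
    return attempt >= 2


def context_expired(attempt: int) -> bool:
    """The context has gone bad, so no reply will ever verify."""
    return False


print(send_with_retry(stale_then_good, 5))
print(send_with_retry(context_expired, 5))
```

With a stale reply the second attempt succeeds; with an expired context
every attempt fails, so without the cap the loop would never end.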

I'm really not that up to speed on the subtleties of NFS Kerberos.

Oh, note this isn't even krb5p, just krb5 mounts (not that it should
matter all that much).

--Russell Cattelan


