Re: v3 timeout behavior

Olga Kornievskaia <aglo@xxxxxxxxx> · Wed, 17 Jun 2020 13:20:02 -0400

On Wed, Jun 17, 2020 at 12:04 PM Olga Kornievskaia <aglo@xxxxxxxxx> wrote:
>
> Hi folks,
>
> I have a question about whether the current client's behaviour is
> desirable. Current behaviour: every time a v3 operation is re-sent to
> the server, we update (double) the timeout. It makes no difference
> whether the previous timer had expired before the re-send happened.
>
> Here's the scenario:
> 1. Client sends a v3 operation
> 2. Server RST-s the connection prior to the timeout (e.g., the
> connection is immediately reset)
> 3. Client re-sends the v3 operation but the timeout is now 120 secs.
>
> As a result, the application sees a 2-minute pause, whereas if a
> connection reset didn't change the timeout value, the client would
> have retried (the 3rd time) after 60 secs.
>
> Question: so in sunrpc, if we get ECONNRESET/ECONNABORTED errors,
> should we skip adjusting the timeout?

This is what I have in mind:
[aglo@localhost linux-nfs]$ git diff

diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index 61b21daf..26be473 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -2413,7 +2413,8 @@ void rpc_force_rebind(struct rpc_clnt *clnt)
                goto out_exit;
        }
        task->tk_action = call_encode;
-       rpc_check_timeout(task);
+       if (status != -ECONNRESET && status != -ECONNABORTED)
+               rpc_check_timeout(task);
        return;
 out_exit:
        rpc_call_rpcerror(task, status);
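
To make the effect on the scenario above concrete, here is a small
user-space model of the timeout arithmetic. It is only a sketch, not
kernel code: struct model_req, model_backoff() and model_resend() are
hypothetical names made up for illustration, the plain doubling stands in
for whatever rpc_check_timeout() ends up doing to the timer, and the
600 sec cap in the usage note is an assumed value. Only the 60 sec
starting timeout and the doubling come from the description above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for per-request timeout state (not a kernel struct). */
struct model_req {
	unsigned int timeout_secs;	/* current retransmit timeout */
	unsigned int max_secs;		/* assumed upper bound on the backoff */
};

/* Double the timeout, capped at max_secs; models the adjustment on re-send. */
void model_backoff(struct model_req *req)
{
	req->timeout_secs *= 2;
	if (req->timeout_secs > req->max_secs)
		req->timeout_secs = req->max_secs;
}

/*
 * Model one re-send that follows an error `status`.
 * With skip_on_reset set, a connection reset/abort leaves the timeout
 * untouched, mirroring the check proposed in the diff. Any other status
 * (e.g. a real timeout) still backs off.
 * Returns the timeout that applies to the re-sent request.
 */
unsigned int model_resend(struct model_req *req, int status, int skip_on_reset)
{
	assert(req != NULL);

	if (!(skip_on_reset &&
	      (status == -ECONNRESET || status == -ECONNABORTED)))
		model_backoff(req);

	return req->timeout_secs;
}
```

Starting from { .timeout_secs = 60, .max_secs = 600 }, a call to
model_resend(&req, -ECONNRESET, 0) yields 120 (the behaviour described
above), while model_resend(&req, -ECONNRESET, 1) keeps it at 60. A
subsequent model_resend(&req, -ETIMEDOUT, 1) still doubles to 120, so
genuine timeouts keep backing off.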