On 2024-04-11 19:08:13, Trond Myklebust wrote:
> > diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
> > index cda0935a68c9..75faf1f05a14 100644
> > --- a/net/sunrpc/clnt.c
> > +++ b/net/sunrpc/clnt.c
> > @@ -669,6 +669,9 @@ static struct rpc_clnt *__rpc_clone_client(struct rpc_create_args *args,
> >  	new->cl_chatty = clnt->cl_chatty;
> >  	new->cl_principal = clnt->cl_principal;
> >  	new->cl_max_connect = clnt->cl_max_connect;
> > +	new->cl_timeout = clnt->cl_timeout;
> > +	rpc_init_rtt(&clnt->cl_rtt_default, clnt->cl_timeout->to_initval);
> > +
> >  	return new;
> >
>
> That ends up clobbering any timeout value that gets set in the
> rpc_create_args, and open codes something that is already being done in
> the call to rpc_new_client().
>
> IOW: I'd much rather see us default the value of args->timeout to that
> of clnt->cl_timeout.

Makes sense. I confirm the following fix works for NFSACL and preserves
the current hard mount behavior for NSM. Other callers of
`rpc_bind_new_program` are unaffected by this.

However, I am not sure whether restoring this behavior is desired for the
other call sites of `__rpc_clone_client`, which seem to be the NFSv4.x
users, so I left that part out. Should that be done in separate patches?

From 8715ef1d574970176e9ac87a7e826ad74f8b910d Mon Sep 17 00:00:00 2001
From: Dan Aloni <dan.aloni@xxxxxxxxxxxx>
Date: Thu, 11 Apr 2024 18:30:56 +0300
Subject: [PATCH] sunrpc: fix NFSACL RPC retry on soft mount

Until 1b63a75180c6 ('SUNRPC: Refactor rpc_clone_client()') in 2012,
`cl_timeout` was copied to the new client, so that all mount parameters
propagated to NFSACL clients.
However, since that change, if the following mount options are given:

    soft,timeo=50,retrans=16,vers=3

The resultant NFSACL client receives:

    cl_softrtry: 1
    cl_timeout: to_initval=60000, to_maxval=60000, to_increment=0, to_retries=2, to_exponential=0

These values lead to NFSACL operations not being retried during
transient network outages on a soft mount. Instead, a getacl call fails
after 60 seconds with EIO.

The simple fix is to pass the existing client's `cl_timeout` as the new
client timeout.

Cc: Chuck Lever <chuck.lever@xxxxxxxxxx>
Cc: Benjamin Coddington <bcodding@xxxxxxxxxx>
Cc: Trond Myklebust <trondmy@xxxxxxxxxxxxxxx>
Link: https://lore.kernel.org/all/20231105154857.ryakhmgaptq3hb6b@xxxxxxxxx/T/
Fixes: 1b63a75180c6 ('SUNRPC: Refactor rpc_clone_client()')
Signed-off-by: Dan Aloni <dan.aloni@xxxxxxxxxxxx>
---
 net/sunrpc/clnt.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index cda0935a68c9..07ffd4ee695a 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -1068,6 +1068,7 @@ struct rpc_clnt *rpc_bind_new_program(struct rpc_clnt *old,
 		.version	= vers,
 		.authflavor	= old->cl_auth->au_flavor,
 		.cred		= old->cl_cred,
+		.timeout	= old->cl_timeout,
 	};
 	struct rpc_clnt *clnt;
 	int err;
-- 
2.39.3

-- 
Dan Aloni