Re: mount options not propagating to NFSACL and NSM RPC clients

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 2024-04-11 at 19:36 +0300, Dan Aloni wrote:
> On 2024-04-11 11:20:14, Benjamin Coddington wrote:
> > > So looks like the request is not being retransmitted. Just to be
> > > sure,
> > > if I cause the nfsd to drop the regular NFS3 prog I/Os like
> > > ACCESS and
> > > LOOKUP, I only get the expected 5 seconds delay following a
> > > successful
> > > retry.
> > > 
> > > Seems we only have an issue with the NFS3ACL prog.
> > 
> > It looks like the client_acl program gets created with
> > rpc_bind_new_program() which doesn't setup the timeouts/retry
> > strategy, and
> > there's nothing after the setup to do it either.
> > 
> > I think the problem has existed since 331702337f2b2..  I think this
> > should
> > fix it up, would you like to test it?
> 
> Please allow me to propose a different change, which I already
> tested.
> 
> Looks to me that the 2012 change that I mentioned below is actually
> when
> the problem started happening, when the `kmemdup` call was removed,
> so
> since then we are simply just missing a field copy, and we are taking
> the timeout from transport-based default timeout globals instead.
> 
> Also, I have a hunch that we need to do something different, because
> of
> the following code in `rpc_new_client`:
> 
>     clnt->cl_rtt = &clnt->cl_rtt_default;
>     rpc_init_rtt(&clnt->cl_rtt_default, clnt->cl_timeout-
> >to_initval);
> 
> Only setting `clnt->cl_timeout` _after_ client clone does not seem to
> be
> right if there are `cl_rtt_default` calculations that depend on it.
> So
> may as well need to call `rpc_init_rtt` too.
> 
> --
> 
> From 55737f82a9bb3e490836d10491995c8082ebcf11 Mon Sep 17 00:00:00
> 2001
> From: Dan Aloni <dan.aloni@xxxxxxxxxxxx>
> Date: Thu, 11 Apr 2024 18:30:56 +0300
> Subject: [PATCH] sunrpc: fix NFSACL RPC retry on soft mount
> 
> It used to be quite awhile ago since 1b63a75180c6 ('SUNRPC: Refactor
> rpc_clone_client()'), in 2012, that `cl_timeout` was copied in so
> that
> all mount parameters propagate to NFSACL clients. However since that
> change, if mount options as follows are given:
> 
>     soft,timeo=50,retrans=16,vers=3
> 
> The resultant NFSACL client receives:
> 
>     cl_softrtry: 1
>     cl_timeout: to_initval=60000, to_maxval=60000, to_increment=0,
> to_retries=2, to_exponential=0
> 
> These values lead to NFSACL operations not being retried under the
> condition of transient network outages with soft mount. Instead,
> getacl
> call fails after 60 seconds with EIO.
> 
> The simple fix is to copy `cl_timeout` and make sure `cl_rtt_default`
> is initialized from it.
> 
> Cc: Chuck Lever <chuck.lever@xxxxxxxxxx>
> Cc: Benjamin Coddington <bcodding@xxxxxxxxxx>
> Link:
> https://lore.kernel.org/all/20231105154857.ryakhmgaptq3hb6b@xxxxxxxxx/T/
> Fixes: 1b63a75180c6 ('SUNRPC: Refactor rpc_clone_client()')
> Signed-off-by: Dan Aloni <dan.aloni@xxxxxxxxxxxx>
> ---
>  net/sunrpc/clnt.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
> index cda0935a68c9..75faf1f05a14 100644
> --- a/net/sunrpc/clnt.c
> +++ b/net/sunrpc/clnt.c
> @@ -669,6 +669,9 @@ static struct rpc_clnt *__rpc_clone_client(struct
> rpc_create_args *args,
>  	new->cl_chatty = clnt->cl_chatty;
>  	new->cl_principal = clnt->cl_principal;
>  	new->cl_max_connect = clnt->cl_max_connect;
> +	new->cl_timeout = clnt->cl_timeout;
> +	rpc_init_rtt(&clnt->cl_rtt_default, clnt->cl_timeout-
> >to_initval);
> +
>  	return new;
> 

That ends up clobbering any timeout value that gets set in the
rpc_create_args, and open codes something that is already being done in
the call to rpc_new_client().

IOW: I'd much rather see us default the value of args->timeout to that
of clnt->cl_timeout.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx






[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux