RE: Trying to reduce NFSv4 timeouts to a few seconds on an established connection

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> From: Jeff Layton <jlayton@xxxxxxxxxx>
> Sent: Tuesday, February 28, 2023 8:24 AM
> 
> On Mon, 2023-02-27 at 14:48 +0000, Andrew Klaassen wrote:
> > > From: Andrew Klaassen <andrew.klaassen@xxxxxxxxxxxxxx>
> > > Sent: Monday, February 6, 2023 12:19 PM
> > >
> > > > From: Andrew Klaassen <andrew.klaassen@xxxxxxxxxxxxxx>
> > > > Sent: Monday, February 6, 2023 10:28 AM
> > > >
> > >
> > > > [snipping for readability; hope that's okay]
> > > >
> > > >  - I'm allocating memory.  I assume that means I should free it
> > > > somewhere.
> > > > But where?  In xprt_destroy(), which appears to do cleanup?  Or in
> > > > xprt_destroy_cb(), which is called from xprt_destroy() and which
> > > > frees
> > > > xprt-
> > > > > servername?  Or somewhere else completely?
> > > >  - If I free the allocated memory, will that cause any problems in
> > > > the cases where no timeout is passed in via the args and the
> > > > static const struct xs_tcp_default_timeout is assigned to
> > > > xprt->timeout?
> > > >  - If freeing the static const struct default will cause a
> > > > problem, what should I do instead?  Allocate and memcpy even when
> > > > assigning the default?  And would that mean doing the same thing
> > > > for all the other transports that are setting timeouts (local,
> > > > udp, tcp, and bc_tcp)?
> > >
> > > [snipping more]
> >
> > Here's the patch in what I hope is its final form.  I'm planning to
> > test it on a couple of hundred nodes over the next month or two.
> >
> > Since I'm completely new to this, what would be the chances of
> > actually getting this patch in the kernel?
> >
> > Thanks.
> >
> > Andrew
> >
> 
> Excellent work! I'll be interested to hear how the testing goes!
> 
> 
> This patch still needs a bit of work. I'd consider this a proof-of- concept. You
> are at least demonstrating the problem with this patch (and a possible
> solution).
> 
> Conceptually, it's not 100% clear to me that we want the exact same timeout
> on the RPC call and the xprt. We might, but working with interlocking
> timeouts can bring in emergent behavioral changes and I haven't thought
> through these.

At this point I'll admit that I don't fully understand the difference between those two, so I expect that your thoughts on it will be more relevant than mine.  :-)  Happy to get more of your feedback on it.  (I do notice that the patch causes timeouts during an initial mount that are twice as long as expected, so I assume that this is related to the two separate things you're talking about.)

Not knowing much, my initial guess is that the solution might be from options like:

 - Create a system-wide tuneable for xs_[local|udp|tcp]_default_timeout.  In our case that's less-than-ideal, since we want to change the total timeout for an NFS mount on a per-server or per-mount basis rather than a system-wide basis.

 - Add a second set of timeout options to NFS so that RPC call and xprt timeouts can be specified separately.  I'm guessing no-one is enthusiastic about option bloat, even if this would be the theoretically cleanest option.

 - Use timeo and retrans for the RPC call timeout, and retry for the xprt timeout.  Or do the opposite.  The NFS manpage describes the current behaviour incorrectly, so this at least wouldn't make the documentation any worse.


> > From caa3308a3bcf39eb95d9b59e63bd96361e98305e Mon Sep 17 00:00:00
> 2001
> > From: Andrew Klaassen <andrew.klaassen@xxxxxxxxxxxxxx>
> > Date: Fri, 10 Feb 2023 10:37:57 -0500
> > Subject: [PATCH] Sun RPC: Use passed-in timeouts if available instead
> > of  always using defaults.
> >
> 
> This needs a real patch description. Describe the problem you were
> having, and how this patch changes things to address it. Make sure you
> add a Signed-off-by line too.
> 
> When you resend, send it to the the nfs client maintainers (Trond and
> Anna) using git-format-patch and git-send-email, and cc linux-nfs list.
> I think your MUA might have mangled the patch a bit. Please look over
> Documentation/process/submitting-patches.rst in the kernel source tree
> too.

Thanks for the tips.  I did read submitting-patches.rst, but obviously not carefully enough.  :-)  Would it be appropriate to submit the patch as-is (with check-patch.pl fixes), or should any potential interlocking-timeout issues be addressed first?


> > ---
> >  include/linux/sunrpc/xprt.h |  3 +++
> >  net/sunrpc/clnt.c           |  1 +
> >  net/sunrpc/xprt.c           | 21 +++++++++++++++++++++
> >  net/sunrpc/xprtsock.c       | 22 +++++++++++++++++++---
> >  4 files changed, 44 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
> > index b9f59aabee53..ca7be090cf83 100644
> > --- a/include/linux/sunrpc/xprt.h
> > +++ b/include/linux/sunrpc/xprt.h
> > @@ -333,6 +333,7 @@ struct xprt_create {
> >       struct svc_xprt         *bc_xprt;       /* NFSv4.1
> > backchannel */
> >       struct rpc_xprt_switch  *bc_xps;
> >       unsigned int            flags;
> > +     const struct rpc_timeout *timeout;      /* timeout parms */
> >  };
> >
> >  struct xprt_class {
> > @@ -373,6 +374,8 @@
> > void                  xprt_release_xprt_cong(struct rpc_xprt *xprt, struct rpc_task
> *task);
> >  void                 xprt_release(struct rpc_task *task);
> >  struct rpc_xprt *    xprt_get(struct rpc_xprt *xprt);
> >  void                 xprt_put(struct rpc_xprt *xprt);
> > +struct rpc_timeout   *xprt_alloc_timeout(const struct rpc_timeout
> > *timeo,
> > +                             const struct rpc_timeout
> > *default_timeo);
> >  struct rpc_xprt *    xprt_alloc(struct net *net, size_t size,
> >                               unsigned int num_prealloc,
> >                               unsigned int max_req);
> > diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
> > index 0b0b9f1eed46..1350c1f489f7 100644
> > --- a/net/sunrpc/clnt.c
> > +++ b/net/sunrpc/clnt.c
> > @@ -532,6 +532,7 @@ struct rpc_clnt *rpc_create(struct rpc_create_args
> > *args)
> >               .addrlen = args->addrsize,
> >               .servername = args->servername,
> >               .bc_xprt = args->bc_xprt,
> > +             .timeout = args->timeout,
> >       };
> >       char servername[48];
> >       struct rpc_clnt *clnt;
> > diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
> > index ab453ede54f0..0bb800c90976 100644
> > --- a/net/sunrpc/xprt.c
> > +++ b/net/sunrpc/xprt.c
> > @@ -1801,6 +1801,26 @@ static void xprt_free_id(struct rpc_xprt *xprt)
> >       ida_free(&rpc_xprt_ids, xprt->id);
> >  }
> >
> > +struct rpc_timeout *xprt_alloc_timeout(const struct rpc_timeout
> > *timeo,
> > +                                    const struct rpc_timeout
> > *default_timeo)
> > +{
> > +     struct rpc_timeout *timeout;
> > +
> > +     timeout = kzalloc(sizeof(*timeout), GFP_KERNEL);
> > +     if (!timeout)
> > +             return ERR_PTR(-ENOMEM);
> > +     if (timeo)
> > +             memcpy(timeout, timeo, sizeof(struct rpc_timeout));
> > +     else
> > +             memcpy(timeout, default_timeo, sizeof(struct
> > rpc_timeout));
> 
> I don't think you need an allocation here. struct rpc_timeout is quite
> small and it only contains a bunch of integers. I think it'd be better
> to just embed this in struct rpc_xprt instead.
> 
> > +     return timeout;
> > +}
> > +
> > +static void xprt_free_timeout(struct rpc_xprt *xprt)
> > +{
> > +     kfree(xprt->timeout);
> > +}
> > +
> >  struct rpc_xprt *xprt_alloc(struct net *net, size_t size,
> >               unsigned int num_prealloc,
> >               unsigned int max_alloc)
> > @@ -1837,6 +1857,7 @@ EXPORT_SYMBOL_GPL(xprt_alloc);
> >
> >  void xprt_free(struct rpc_xprt *xprt)
> >  {
> > +     xprt_free_timeout(xprt);
> >       put_net_track(xprt->xprt_net, &xprt->ns_tracker);
> >       xprt_free_all_slots(xprt);
> >       xprt_free_id(xprt);
> > diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> > index aaa5b2741b79..13703f8e0ef1 100644
> > --- a/net/sunrpc/xprtsock.c
> > +++ b/net/sunrpc/xprtsock.c
> > @@ -2924,7 +2924,12 @@ static struct rpc_xprt *xs_setup_udp(struct
> > xprt_create *args)
> >
> >       xprt->ops = &xs_udp_ops;
> >
> > -     xprt->timeout = &xs_udp_default_timeout;
> > +     xprt->timeout = xprt_alloc_timeout(args->timeout,
> > &xs_udp_default_timeout);
> > +     if (IS_ERR(xprt->timeout))
> > +     {
> 
> Kernel coding style has the brackets on the same line as the "if"
> statement. You should run your next iteration through checkpatch.pl.

Thanks.  I did that once, but forgot to do it again before sending this version.


> > +             ret = ERR_CAST(xprt->timeout);
> > +             goto out_err;
> > +     }
> >
> >       INIT_WORK(&transport->recv_worker,
> > xs_udp_data_receive_workfn);
> >       INIT_WORK(&transport->error_worker, xs_error_handle);
> > @@ -3003,7 +3008,13 @@ static struct rpc_xprt *xs_setup_tcp(struct
> > xprt_create *args)
> >       xprt->idle_timeout = XS_IDLE_DISC_TO;
> >
> >       xprt->ops = &xs_tcp_ops;
> > -     xprt->timeout = &xs_tcp_default_timeout;
> > +
> > +     xprt->timeout = xprt_alloc_timeout(args->timeout,
> > &xs_tcp_default_timeout);
> > +     if (IS_ERR(xprt->timeout))
> > +     {
> > +             ret = ERR_CAST(xprt->timeout);
> > +             goto out_err;
> > +     }
> >
> >       xprt->max_reconnect_timeout = xprt->timeout->to_maxval;
> >       xprt->connect_timeout = xprt->timeout->to_initval *
> > @@ -3071,7 +3082,12 @@ static struct rpc_xprt *xs_setup_bc_tcp(struct
> > xprt_create *args)
> >       xprt->prot = IPPROTO_TCP;
> >       xprt->xprt_class = &xs_bc_tcp_transport;
> >       xprt->max_payload = RPC_MAX_FRAGMENT_SIZE;
> > -     xprt->timeout = &xs_tcp_default_timeout;
> > +     xprt->timeout = xprt_alloc_timeout(args->timeout,
> > &xs_tcp_default_timeout);
> > +     if (IS_ERR(xprt->timeout))
> > +     {
> > +             ret = ERR_CAST(xprt->timeout);
> > +             goto out_err;
> > +     }
> >
> >       /* backchannel */
> >       xprt_set_bound(xprt);
> > --
> > 2.39.1
> >
> 
> --
> Jeff Layton <jlayton@xxxxxxxxxx>

Andrew Klaassen






[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux