Re: [PATCH] SUNRPC: Use TCP_CORK to optimise send performance on the server

On Sat, 2021-02-13 at 21:53 +0000, Chuck Lever wrote:
> Hi Trond-
> 
> > On Feb 13, 2021, at 3:25 PM, trondmy@xxxxxxxxxx wrote:
> > 
> > From: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
> > 
> > Use a counter to keep track of how many requests are queued behind
> > the
> > xprt->xpt_mutex, and keep TCP_CORK set until the queue is empty.
> 
> I'm intrigued, but IMO, the patch description needs to explain
> why this change should be made. Why abandon Nagle?
> 

This doesn't change the Nagle/TCP_NODELAY settings. It just switches to
using the new documented kernel interface.

The only change is to use TCP_CORK so that we don't send out partially
filled TCP frames when we can see that other RPC replies are queued up
for transmission.

Note that the combination TCP_CORK+TCP_NODELAY is common, and the main
effect of the latter is that when we turn off TCP_CORK, there is an
immediate forced push of the TCP queue.
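
To illustrate the pattern with the documented helpers (just a sketch,
not the patch itself -- example_sock_init(), example_send_reply() and
the "more_queued" test are invented here for illustration):

#include <linux/net.h>	/* struct socket, sock_sendmsg() */
#include <linux/tcp.h>	/* tcp_sock_set_cork(), tcp_sock_set_nodelay() */

/* Setup: turn off Nagle once, so that clearing the cork below pushes
 * whatever is still queued out immediately. */
static void example_sock_init(struct socket *sock)
{
	tcp_sock_set_nodelay(sock->sk);
}

/* Send path: keep the socket corked while more replies are known to
 * be waiting, and uncork (forcing a push) once the queue is empty. */
static int example_send_reply(struct socket *sock, struct msghdr *msg,
			      bool more_queued)
{
	int err;

	tcp_sock_set_cork(sock->sk, true);
	err = sock_sendmsg(sock, msg);
	if (!more_queued)
		tcp_sock_set_cork(sock->sk, false);
	return err;
}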

> If you expect a performance impact, the description should
> provide metrics to show it.

I don't have a performance system to measure the improvement
accurately. However, I am seeing a notable improvement with the
equivalent client change. Specifically, xfstests generic/127 shows a
~20% improvement on my VM-based test rig.

> (We should have Daire try this change with his multi-client
> setup).
> 
> 
> > Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
> > ---
> > include/linux/sunrpc/svcsock.h | 2 ++
> > net/sunrpc/svcsock.c           | 8 +++++++-
> > 2 files changed, 9 insertions(+), 1 deletion(-)
> > 
> > diff --git a/include/linux/sunrpc/svcsock.h
> > b/include/linux/sunrpc/svcsock.h
> > index b7ac7fe68306..bcc555c7ae9c 100644
> > --- a/include/linux/sunrpc/svcsock.h
> > +++ b/include/linux/sunrpc/svcsock.h
> > @@ -35,6 +35,8 @@ struct svc_sock {
> >         /* Total length of the data (not including fragment
> > headers)
> >          * received so far in the fragments making up this rpc: */
> >         u32                     sk_datalen;
> > +       /* Number of queued send requests */
> > +       atomic_t                sk_sendqlen;
> 
> Can you take advantage of xpt_mutex: update this field
> only in the critical section, and make it a simple
> integer type?
> 
> 
> >         struct page *           sk_pages[RPCSVC_MAXPAGES];      /*
> > received data */
> > };
> > diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
> > index 5a809c64dc7b..231f510a4830 100644
> > --- a/net/sunrpc/svcsock.c
> > +++ b/net/sunrpc/svcsock.c
> > @@ -1171,18 +1171,23 @@ static int svc_tcp_sendto(struct svc_rqst
> > *rqstp)
> > 
> >         svc_tcp_release_rqst(rqstp);
> > 
> > +       atomic_inc(&svsk->sk_sendqlen);
> >         mutex_lock(&xprt->xpt_mutex);
> >         if (svc_xprt_is_dead(xprt))
> >                 goto out_notconn;
> > +       tcp_sock_set_cork(svsk->sk_sk, true);
> >         err = svc_tcp_sendmsg(svsk->sk_sock, &msg, xdr, marker,
> > &sent);
> >         xdr_free_bvec(xdr);
> >         trace_svcsock_tcp_send(xprt, err < 0 ? err : sent);
> >         if (err < 0 || sent != (xdr->len + sizeof(marker)))
> >                 goto out_close;
> > +       if (atomic_dec_and_test(&svsk->sk_sendqlen))
> > +               tcp_sock_set_cork(svsk->sk_sk, false);
> >         mutex_unlock(&xprt->xpt_mutex);
> >         return sent;
> > 
> > out_notconn:
> > +       atomic_dec(&svsk->sk_sendqlen);
> >         mutex_unlock(&xprt->xpt_mutex);
> >         return -ENOTCONN;
> > out_close:
> > @@ -1192,6 +1197,7 @@ static int svc_tcp_sendto(struct svc_rqst
> > *rqstp)
> >                   (err < 0) ? err : sent, xdr->len);
> >         set_bit(XPT_CLOSE, &xprt->xpt_flags);
> >         svc_xprt_enqueue(xprt);
> > +       atomic_dec(&svsk->sk_sendqlen);
> >         mutex_unlock(&xprt->xpt_mutex);
> >         return -EAGAIN;
> > }
> > @@ -1261,7 +1267,7 @@ static void svc_tcp_init(struct svc_sock
> > *svsk, struct svc_serv *serv)
> >                 svsk->sk_datalen = 0;
> >                 memset(&svsk->sk_pages[0], 0, sizeof(svsk-
> > >sk_pages));
> > 
> > -               tcp_sk(sk)->nonagle |= TCP_NAGLE_OFF;
> > +               tcp_sock_set_nodelay(sk);
> > 
> >                 set_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags);
> >                 switch (sk->sk_state) {
> > -- 
> > 2.29.2
> > 
> 
> --
> Chuck Lever
> 
> 
> 

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx





