Re: [PATCH v3 12/12] sunrpc: Allow keepalive ping on a credit-full transport

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 2017-02-09 at 14:42 -0500, Chuck Lever wrote:
> > On Feb 9, 2017, at 10:37 AM, Chuck Lever <chuck.lever@xxxxxxxxxx>
> > wrote:
> > 
> > > 
> > > On Feb 8, 2017, at 7:48 PM, Trond Myklebust <trondmy@primarydata.
> > > com> wrote:
> > > 
> > > On Wed, 2017-02-08 at 19:19 -0500, Chuck Lever wrote:
> > > > > On Feb 8, 2017, at 7:05 PM, Trond Myklebust <trondmy@primaryd
> > > > > ata.co
> > > > > m> wrote:
> > > > > 
> > > > > On Wed, 2017-02-08 at 17:01 -0500, Chuck Lever wrote:
> > > > > > Allow RPC-over-RDMA to send NULL pings even when the
> > > > > > transport
> > > > > > has
> > > > > > hit its credit limit. One RPC-over-RDMA credit is reserved
> > > > > > for
> > > > > > operations like keepalive.
> > > > > > 
> > > > > > For transports that convey NFSv4, it seems like lease
> > > > > > renewal
> > > > > > would
> > > > > > also be a candidate for using a priority transport slot.
> > > > > > I'd like
> > > > > > to
> > > > > > see a mechanism better than RPCRDMA_PRIORITY that can
> > > > > > ensure only
> > > > > > one priority operation is in use at a time.
> > > > > > 
> > > > > > Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
> > > > > > ---
> > > > > > include/linux/sunrpc/sched.h    |    2 ++
> > > > > > net/sunrpc/xprt.c               |    4 ++++
> > > > > > net/sunrpc/xprtrdma/transport.c |    3 ++-
> > > > > > net/sunrpc/xprtrdma/verbs.c     |   13 ++++++++-----
> > > > > > 4 files changed, 16 insertions(+), 6 deletions(-)
> > > > > > 
> > > > > > diff --git a/include/linux/sunrpc/sched.h
> > > > > > b/include/linux/sunrpc/sched.h
> > > > > > index 13822e6..fcea158 100644
> > > > > > --- a/include/linux/sunrpc/sched.h
> > > > > > +++ b/include/linux/sunrpc/sched.h
> > > > > > @@ -127,6 +127,7 @@ struct rpc_task_setup {
> > > > > > #define RPC_TASK_TIMEOUT	0x1000		/*
> > > > > > fail
> > > > > > with
> > > > > > ETIMEDOUT on timeout */
> > > > > > #define RPC_TASK_NOCONNECT	0x2000		/*
> > > > > > return
> > > > > > ENOTCONN if not connected */
> > > > > > #define RPC_TASK_NO_RETRANS_TIMEOUT	0x4000		
> > > > > > /*
> > > > > > wait forever for a reply */
> > > > > > +#define RPC_TASK_NO_CONG	0x8000		/*
> > > > > > skip
> > > > > > congestion control */
> > > > > > 
> > > > > > #define RPC_TASK_SOFTPING	(RPC_TASK_SOFT |
> > > > > > RPC_TASK_SOFTCONN)
> > > > > > 
> > > > > > @@ -137,6 +138,7 @@ struct rpc_task_setup {
> > > > > > #define RPC_IS_SOFT(t)		((t)->tk_flags &
> > > > > > (RPC_TASK_SOFT|RPC_TASK_TIMEOUT))
> > > > > > #define RPC_IS_SOFTCONN(t)	((t)->tk_flags &
> > > > > > RPC_TASK_SOFTCONN)
> > > > > > #define RPC_WAS_SENT(t)		((t)->tk_flags &
> > > > > > RPC_TASK_SENT)
> > > > > > +#define RPC_SKIP_CONG(t)	((t)->tk_flags &
> > > > > > RPC_TASK_NO_CONG)
> > > > > > 
> > > > > > #define RPC_TASK_RUNNING	0
> > > > > > #define RPC_TASK_QUEUED		1
> > > > > > diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
> > > > > > index b530a28..a477ee6 100644
> > > > > > --- a/net/sunrpc/xprt.c
> > > > > > +++ b/net/sunrpc/xprt.c
> > > > > > @@ -392,6 +392,10 @@ static inline void
> > > > > > xprt_release_write(struct
> > > > > > rpc_xprt *xprt, struct rpc_task *ta
> > > > > > {
> > > > > > 	struct rpc_rqst *req = task->tk_rqstp;
> > > > > > 
> > > > > > +	if (RPC_SKIP_CONG(task)) {
> > > > > > +		req->rq_cong = 0;
> > > > > > +		return 1;
> > > > > > +	}
> > > > > 
> > > > > Why not just have the RDMA layer call xprt_reserve_xprt()
> > > > > (and
> > > > > xprt_release_xprt()) if this flag is set? It seems to me that
> > > > > you
> > > > > will
> > > > > need some kind of extra congestion control in the RDMA layer
> > > > > anyway
> > > > > since you only have one reserved credit for these privileged
> > > > > tasks
> > > > > (or
> > > > > did I miss where that is being gated?).
> > > > 
> > > > Thanks for the review.
> > > > 
> > > > See RPCRDMA_IA_RSVD_CREDIT in 11/12. It's a hack I'm not
> > > > terribly happy with.
> > > > 
> > > > So, I think you are suggesting replacing xprtrdma's
> > > > ->reserve_xprt with something like:
> > > > 
> > > > int xprt_rdma_reserve_xprt(xprt, task)
> > > > {
> > > >      if (RPC_SKIP_CONG(task))
> > > >           return xprt_reserve_xprt(xprt, task);
> > > >      return xprt_reserve_xprt_cong(xprt, task);
> > > > }
> > > > 
> > > > and likewise for ->release_xprt ?
> > > 
> > > Right.
> 
> This seems to work fine for the normal cases.
> 
> I'm confused about how to construct xprt_rdma_release_xprt()
> so it never releases a normal RPC task when a SKIP_CONG
> task completes and the credit limit is still full.
> 
> If it should send a normal task using the reserved credit
> and that task hangs too, we're in exactly the position
> we wanted to avoid.
> 
> My original solution might have had a similar problem,
> come to think of it.
> 
> 

That's true... You may need to set up a separate waitqueue that is
reserved for SKIP_CONG tasks. Again, it makes sense to keep that in the
RDMA code.

-- 


	
	


Trond Myklebust
Principal System Architect
4300 El Camino Real | Suite 100
Los Altos, CA  94022
W: 650-422-3800
C: 801-921-4583 
www.primarydata.com



��.n��������+%������w��{.n�����{��w���jg��������ݢj����G�������j:+v���w�m������w�������h�����٥




[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux