On Wed, Jul 17, 2013 at 07:03:19PM -0500, Ben Myers wrote: > Hey Bruce, > > On Tue, Jul 16, 2013 at 10:24:30AM -0400, J.Bruce Fields wrote: > > Adding Ben Myers to Cc: in case he has any testing help or advice (see > > below): > > > > On Tue, Jul 16, 2013 at 02:00:21PM +1000, NeilBrown wrote: > > > That would be because I only allowed extra requests up to half the number of > > > idle threads, and there would be zero idle threads. > > > > > > So yes - I see your point. > > > > > > I now think that it is sufficient to just allow one request through per > > > socket. While there is high memory pressure, that is sufficient. When the > > > memory pressure drops, that should be enough to cause sndbuf to grow. > > > > > > We should be able to use xpt_reserved to check if this is the only request > > > or not: > > > > A remaining possible bad case is if the number of "bad" sockets is more > > than the number of threads, and if the problem is something that won't > > resolve itself by just waiting a little while. > > > > I don't know if that's likely in practice. Maybe a network failure cuts > > you off from a large swath (but not all) of your clients? Or maybe some > > large proportion of your clients are just slow? > > > > But this is two lines and looks likely to solve the immediate > > problem: > > > > > diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c > > > index 80a6640..5b832ef 100644 > > > --- a/net/sunrpc/svc_xprt.c > > > +++ b/net/sunrpc/svc_xprt.c > > > @@ -331,7 +331,8 @@ static bool svc_xprt_has_something_to_do(struct svc_xprt *xprt) > > > if (xprt->xpt_flags & ((1<<XPT_CONN)|(1<<XPT_CLOSE))) > > > return true; > > > if (xprt->xpt_flags & ((1<<XPT_DATA)|(1<<XPT_DEFERRED))) > > > - return xprt->xpt_ops->xpo_has_wspace(xprt); > > > + return xprt->xpt_ops->xpo_has_wspace(xprt) || > > > + atomic_read(&xprt->xpt_reserved) == 0; > > > return false; > > > } > > > > > > @@ -730,7 +731,8 @@ static int svc_handle_xprt(struct svc_rqst *rqstp, struct svc_xprt *xprt) > > > newxpt = xprt->xpt_ops->xpo_accept(xprt); > > > if (newxpt) > > > svc_add_new_temp_xprt(serv, newxpt); > > > - } else if (xprt->xpt_ops->xpo_has_wspace(xprt)) { > > > + } else if (xprt->xpt_ops->xpo_has_wspace(xprt) || > > > + atomic_read(&xprt->xpt_reserved) == 0) { > > > /* XPT_DATA|XPT_DEFERRED case: */ > > > dprintk("svc: server %p, pool %u, transport %p, inuse=%d\n", > > > rqstp, rqstp->rq_pool->sp_id, xprt, > > > > > > > > > Thoughts? > > > > > > I tried testing this, but xpo_has_wspace never fails for me, even if I remove > > > the calls to svc_sock_setbufsize (which probably aren't wanted for TCP any > > > more). > > > > Ben, do you still have a setup that can reproduce the problem? Or did you > > ever find an easier way to reproduce? > > Unfortunately I don't still have that test rig handy, and I did not find an > easier way to reproduce this. I do agree that it 'is sufficient to just allow > one request through per socket'... probably that would have solved the problem > in our case. Maybe it would help to limit the amount of memory in your system > to try and induce additional memory pressure? IIRC it was a 1mb block size > read workload where we would hit this. That makes sense. Or I wonder if you could cheat and artificially force the tcp code to behave as if there was memory pressure. (E.g. by calling tcp_enter_memory_pressure on some trigger). Neil, are you still looking at this? (If you're done with it, could you resend that patch with a signed-off-by?) --b. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html