On Fri, 27 Aug 2021, Mike Javorski wrote:
> Neil:
>
> I am actually compiling a 5.13.13 kernel with the patch that Chuck
> suggested earlier right now. I am doing the full compile matching the
> distro compile as I don't have a targeted kernel config ready to go
> (it's been years), and I want to test like for like anyway. It should
> be ready to install in the AM, my time, so I will test with that first
> tomorrow and see if it resolves the issue, if not, I will report back
> and then try your revert suggestion. On the issue of memory though, my
> server has 16GB of memory (and free currently shows ~1GB unused, and
> ~11GB in buffers/caches), so this really shouldn't be an available
> memory issue, but I guess we'll find out.
>
> Thanks for the info.

Take your time. Just FYI, the fix Chuck identified doesn't match your
symptoms. That bug can only occur if
/sys/module/sunrpc/parameters/svc_rpc_per_connection_limit
is non-zero. When it does occur, the TCP connection completely
freezes - no further traffic. It won't even close.

I took a break and got some fresh air and now I understand the
problem. Please try the patch below, not the revert I suggested.

The pause can, I think, be caused by fragmented memory - not just low
memory. If only 1/16 of your memory is free, it could easily be
fragmented.

Thanks,
NeilBrown

Subject: [PATCH] SUNRPC: don't pause on incomplete allocation

alloc_pages_bulk_array() attempts to allocate at least one page based on
the provided pages, and then opportunistically allocates more if that
can be done without dropping the spinlock.

So if it returns fewer than requested, that could just mean that it
needed to drop the lock. In that case, try again immediately.

Only pause for a time if no progress could be made.

Signed-off-by: NeilBrown <neilb@xxxxxxx>
---
 net/sunrpc/svc_xprt.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index d66a8e44a1ae..99268dd95519 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -662,7 +662,7 @@ static int svc_alloc_arg(struct svc_rqst *rqstp)
 {
 	struct svc_serv *serv = rqstp->rq_server;
 	struct xdr_buf *arg = &rqstp->rq_arg;
-	unsigned long pages, filled;
+	unsigned long pages, filled, prev;
 
 	pages = (serv->sv_max_mesg + 2 * PAGE_SIZE) >> PAGE_SHIFT;
 	if (pages > RPCSVC_MAXPAGES) {
@@ -672,11 +672,14 @@ static int svc_alloc_arg(struct svc_rqst *rqstp)
 		pages = RPCSVC_MAXPAGES;
 	}
 
-	for (;;) {
+	for (prev = 0;; prev = filled) {
 		filled = alloc_pages_bulk_array(GFP_KERNEL, pages,
 						rqstp->rq_pages);
 		if (filled == pages)
 			break;
+		if (filled > prev)
+			/* Made progress, don't sleep yet */
+			continue;
 
 		set_current_state(TASK_INTERRUPTIBLE);
 		if (signalled() || kthread_should_stop()) {