Re: NFS server regression in kernel 5.13 (tested w/ 5.13.9)

On Fri, 27 Aug 2021, Mike Javorski wrote:
> Neil:
> 
> I am actually compiling a 5.13.13 kernel with the patch that Chuck
> suggested earlier right now. I am doing the full compile matching the
> distro compile as I don't have a targeted kernel config ready to go
> (it's been years), and I want to test like for like anyway. It should
> be ready to install in the AM, my time, so I will test with that first
> tomorrow and see if it resolves the issue, if not, I will report back
> and then try your revert suggestion. On the issue of memory though, my
> server has 16GB of memory (and free currently shows ~1GB unused, and
> ~11GB in buffers/caches), so this really shouldn't be an available
> memory issue, but I guess we'll find out.
> 
> Thanks for the info.

Take your time.

Just FYI, the fix Chuck identified doesn't match your symptoms.
That bug can only occur if
   /sys/module/sunrpc/parameters/svc_rpc_per_connection_limit
is non-zero.  When it does occur, the TCP connection completely
freezes - no further traffic.  It won't even close.

I took a break and got some fresh air and now I understand the problem.
Please try the patch below, not the revert I suggested.
The pause can, I think, be caused by fragmented memory - not just low
memory.  If only 1/16 of your memory is free, it could easily be
fragmented.
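
As a rough userspace illustration of why a short return from the bulk
allocator need not mean "sleep now" (this is not kernel code), the sketch
below counts how often a fill loop pauses with and without a "made
progress, retry immediately" check. Everything in it is hypothetical:
mock_bulk_fill() is only a stand-in for alloc_pages_bulk_array(), and it
assumes that helper returns the total number of populated slots, which is
what a `filled > prev` comparison relies on. MAX_SLOTS, MAX_PAUSES and
count_pauses() are names invented for the example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SLOTS  64	/* hypothetical array size for the sketch */
#define MAX_PAUSES 5	/* give up after this many pauses so the demo ends */

/*
 * Hypothetical stand-in for a bulk allocator: populate at most `burst`
 * empty slots per call and return the total number of populated slots.
 */
static size_t mock_bulk_fill(size_t burst, size_t want, int *slots)
{
	size_t i, added = 0, filled = 0;

	for (i = 0; i < want; i++) {
		if (!slots[i] && added < burst) {
			slots[i] = 1;
			added++;
		}
		if (slots[i])
			filled++;
	}
	return filled;
}

/*
 * Fill `want` slots and return how many times the loop paused.
 * retry_on_progress == 0: pause after every short return.
 * retry_on_progress == 1: pause only when a call made no progress.
 */
static size_t count_pauses(size_t want, size_t burst, int retry_on_progress)
{
	int slots[MAX_SLOTS] = { 0 };
	size_t filled, prev, pauses = 0;

	assert(want <= MAX_SLOTS);

	for (prev = 0;; prev = filled) {
		filled = mock_bulk_fill(burst, want, slots);
		if (filled == want)
			break;
		if (retry_on_progress && filled > prev)
			continue;	/* made progress, don't pause yet */
		if (++pauses >= MAX_PAUSES)	/* stands in for the sleep */
			break;
	}
	return pauses;
}
```

With want = 8 and burst = 3, count_pauses(8, 3, 0) pauses twice while
count_pauses(8, 3, 1) does not pause at all; when the mock allocator makes
no progress (burst = 0), both variants pause until MAX_PAUSES is reached.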

Thanks,
NeilBrown

Subject: [PATCH] SUNRPC: don't pause on incomplete allocation

alloc_pages_bulk_array() attempts to allocate at least one page based on
the provided pages, and then opportunistically allocates more if that
can be done without dropping the spinlock.

So if it returns fewer than requested, that could just mean that it
needed to drop the lock.  In that case, try again immediately.

Only pause for a time if no progress could be made.

Signed-off-by: NeilBrown <neilb@xxxxxxx>
---
 net/sunrpc/svc_xprt.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index d66a8e44a1ae..99268dd95519 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -662,7 +662,7 @@ static int svc_alloc_arg(struct svc_rqst *rqstp)
 {
 	struct svc_serv *serv = rqstp->rq_server;
 	struct xdr_buf *arg = &rqstp->rq_arg;
-	unsigned long pages, filled;
+	unsigned long pages, filled, prev;
 
 	pages = (serv->sv_max_mesg + 2 * PAGE_SIZE) >> PAGE_SHIFT;
 	if (pages > RPCSVC_MAXPAGES) {
@@ -672,11 +672,14 @@ static int svc_alloc_arg(struct svc_rqst *rqstp)
 		pages = RPCSVC_MAXPAGES;
 	}
 
-	for (;;) {
+	for (prev = 0;; prev = filled) {
 		filled = alloc_pages_bulk_array(GFP_KERNEL, pages,
 						rqstp->rq_pages);
 		if (filled == pages)
 			break;
+		if (filled > prev)
+			/* Made progress, don't sleep yet */
+			continue;
 
 		set_current_state(TASK_INTERRUPTIBLE);
 		if (signalled() || kthread_should_stop()) {


