> On Aug 27, 2021, at 6:00 PM, Mike Javorski <mike.javorski@xxxxxxxxx> wrote: > > OK, an update. Several hours of spaced out testing sessions and the > first patch seems to have resolved the issue. There may be a very tiny > bit of lag that still occurs when opening/processing new files, but so > far on this kernel I have not had any multi-second freezes. I am still > waiting on the kernel with Neil's patch to compile (compiling on this > underpowered server so it's taking several hours), but I think the > testing there will just be to see if I can show it works still, and > then to try and test in a memory constrained VM. To see if I can > recreate Neil's experiment. Likely will have to do this over the > weekend given the kernel compile delay + fiddling with a VM. Thanks for your testing! > Chuck: I don't mean to overstep bounds, but is it possible to get that > patch pulled into 5.13 stable? That may help things for several people > while 5.14 goes through it's shakedown in archlinux prior to release. The patch had a Fixes: tag, so it should get automatically backported to every kernel that has the broken commit. If you don't see it in a subsequent 5.13 stable kernel, you are free to ask the stable maintainers to consider it. > - mike > > On Fri, Aug 27, 2021 at 10:07 AM Mike Javorski <mike.javorski@xxxxxxxxx> wrote: >> >> Chuck: >> I just booted a 5.13.13 kernel with your suggested patch. No freezes >> on the first test, but that sometimes happens so I will let the server >> settle some and try it again later in the day (which also would align >> with Neil's comment on memory fragmentation being a contributor). >> >> Neil: >> I have started a compile with the above kernel + your patch to test >> next unless you or Chuck determine that it isn't needed, or that I >> should test both patches discreetly. As the above is already merged to >> 5.14 it seemed logical to just add your patch on top. >> >> I will also try to set up a vm to test your md5sum scenario with the >> various kernels since it's a much faster thing to test. >> >> - mike >> >> On Fri, Aug 27, 2021 at 7:13 AM Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote: >>> >>> >>>> On Aug 27, 2021, at 3:14 AM, NeilBrown <neilb@xxxxxxx> wrote: >>>> >>>> Subject: [PATCH] SUNRPC: don't pause on incomplete allocation >>>> >>>> alloc_pages_bulk_array() attempts to allocate at least one page based on >>>> the provided pages, and then opportunistically allocates more if that >>>> can be done without dropping the spinlock. >>>> >>>> So if it returns fewer than requested, that could just mean that it >>>> needed to drop the lock. In that case, try again immediately. >>>> >>>> Only pause for a time if no progress could be made. >>> >>> The case I was worried about was "no pages available on the >>> pcplist", in which case, alloc_pages_bulk_array() resorts >>> to calling __alloc_pages() and returns only one new page. >>> >>> "No progess" would mean even __alloc_pages() failed. >>> >>> So this patch would behave essentially like the >>> pre-alloc_pages_bulk_array() code: call alloc_page() for >>> each empty struct_page in the array without pausing. That >>> seems correct to me. >>> >>> >>> I would add >>> >>> Fixes: f6e70aab9dfe ("SUNRPC: refresh rq_pages using a bulk page allocator") >>> >>> >>>> Signed-off-by: NeilBrown <neilb@xxxxxxx> >>>> --- >>>> net/sunrpc/svc_xprt.c | 7 +++++-- >>>> 1 file changed, 5 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c >>>> index d66a8e44a1ae..99268dd95519 100644 >>>> --- a/net/sunrpc/svc_xprt.c >>>> +++ b/net/sunrpc/svc_xprt.c >>>> @@ -662,7 +662,7 @@ static int svc_alloc_arg(struct svc_rqst *rqstp) >>>> { >>>> struct svc_serv *serv = rqstp->rq_server; >>>> struct xdr_buf *arg = &rqstp->rq_arg; >>>> - unsigned long pages, filled; >>>> + unsigned long pages, filled, prev; >>>> >>>> pages = (serv->sv_max_mesg + 2 * PAGE_SIZE) >> PAGE_SHIFT; >>>> if (pages > RPCSVC_MAXPAGES) { >>>> @@ -672,11 +672,14 @@ static int svc_alloc_arg(struct svc_rqst *rqstp) >>>> pages = RPCSVC_MAXPAGES; >>>> } >>>> >>>> - for (;;) { >>>> + for (prev = 0;; prev = filled) { >>>> filled = alloc_pages_bulk_array(GFP_KERNEL, pages, >>>> rqstp->rq_pages); >>>> if (filled == pages) >>>> break; >>>> + if (filled > prev) >>>> + /* Made progress, don't sleep yet */ >>>> + continue; >>>> >>>> set_current_state(TASK_INTERRUPTIBLE); >>>> if (signalled() || kthread_should_stop()) { >>> >>> -- >>> Chuck Lever >>> >>> >>> -- Chuck Lever