Re: NFS server regression in kernel 5.13 (tested w/ 5.13.9)

Mike Javorski <mike.javorski@xxxxxxxxx> · Thu, 26 Aug 2021 23:11:12 -0700

Neil:

I am actually compiling a 5.13.13 kernel with the patch that Chuck
suggested earlier right now. I am doing the full compile matching the
distro compile as I don't have a targeted kernel config ready to go
(it's been years), and I want to test like for like anyway. It should
be ready to install in the AM, my time, so I will test with that first
tomorrow and see if it resolves the issue. If not, I will report back
and then try your revert suggestion. On the issue of memory though, my
server has 16GB of memory (and free currently shows ~1GB unused, and
~11GB in buffers/caches), so this really shouldn't be an available
memory issue, but I guess we'll find out.

Thanks for the info.

- mike
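
For reference, the split between unused memory and memory held in buffers/caches that free reports can also be read from /proc/meminfo. Below is a minimal sketch, assuming a Linux host; the function names are hypothetical, and the result only approximates free's buff/cache column, which also counts reclaimable slab.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer value.

    Most fields are reported in kB; the unit suffix is dropped.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not "Name: value"
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def memory_summary(fields):
    """Return (unused, buffers + page cache, available) in GiB."""
    def to_gib(kib):
        return kib / (1024 * 1024)

    return (
        to_gib(fields.get("MemFree", 0)),
        to_gib(fields.get("Buffers", 0) + fields.get("Cached", 0)),
        to_gib(fields.get("MemAvailable", 0)),
    )
```

On the server this could be called as memory_summary(parse_meminfo(open("/proc/meminfo").read())). A large buffers/caches figure next to a small unused figure is normal, since the page cache can usually be reclaimed on demand.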

On Thu, Aug 26, 2021 at 10:27 PM NeilBrown <neilb@xxxxxxx> wrote:
>
>
> [[Mel: if you read through to the end you'll see why I cc:ed you on this]]
>
> On Fri, 27 Aug 2021, Mike Javorski wrote:
> > I just tried the same mount with 4 different nfsvers values: 3, 4.0, 4.1 and 4.2
> >
> > At first I thought it might be "working" because I only got freezes
> > with 4.2 initially, but I went back and re-tested (to be sure) and got
> > freezes with all 4 versions. So the nfsvers setting doesn't seem to
> > have an impact. I did verify at each pass that the 'nfsvers=' value
> > was present and correct in the mount output.
> >
> > FYI: another user posted on the archlinux reddit with a similar issue.
> > I suggested they try with a 5.12 kernel, and that "solved" the issue
> > for them as well.
>
> Well... I have good news and I have bad news.
>
> First the good.
> I reviewed all the symptoms again, and browsed the commits between
> working and not-working, and the only pattern that made any sense was
> that there was some issue with memory allocation.  The pauses - I
> reasoned - were most likely pauses while allocating memory.
>
> So instead of testing in a VM with 2G of RAM, I tried 512MB, and
> suddenly the problem was trivial to reproduce.  Specifically I created a
> (sparse) 1GB file on the test VM, exported it over NFS, and ran "md5sum"
> on the file from an NFS client.  With 5.12 this reliably takes about 90 seconds
> (as it does with 2G RAM).  On 5.13 and 512MB RAM, it usually takes a lot
> longer.  5, 6, 7, 8 minutes (and assorted seconds).
>
> The most questionable nfsd/memory-related patch in 5.13 is
>
>  Commit f6e70aab9dfe ("SUNRPC: refresh rq_pages using a bulk page allocator")
>
> I reverted that and now the problem is no longer there.  Gone.  90 seconds
> every time.
>
> Now the bad news: I don't know why.  That patch should be a good patch,
> with a small performance improvement, particularly at very high loads.
> (Maybe even a big improvement at very high loads.)
> The problem must be in alloc_pages_bulk_array(), which is a new
> interface, so not possible to bisect.
>
> So I might have a look at the code next week, but I've cc:ed Mel Gorman
> in case he comes up with some ideas sooner.
>
> For now, you can just revert that patch.
>
> Thanks for all the testing you did!!  It certainly helped.
>
> NeilBrown
>
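
The reproduction described in the quoted message (a sparse 1GB file on the server, exported over NFS, then md5sum run from a client) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names and any paths are hypothetical, the export itself still has to be configured separately, and the second helper simply stands in for running md5sum under a timer.

```python
import hashlib
import time


def make_sparse_file(path, size_bytes):
    """Create a sparse file of size_bytes; no data blocks are written."""
    with open(path, "wb") as f:
        f.truncate(size_bytes)


def timed_md5(path, chunk_size=1 << 20):
    """Read the whole file and return (md5 hex digest, elapsed seconds)."""
    digest = hashlib.md5()
    start = time.monotonic()
    with open(path, "rb") as f:
        # Read in fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest(), time.monotonic() - start
```

On the server one would call make_sparse_file() with a path inside the exported directory and a size of 1 << 30, and on the client call timed_md5() on the same file through the NFS mount. Repeating the client-side run a few times on each kernel gives timings to compare against the figures in the quoted message.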