[[Mel: if you read through to the end you'll see why I cc:ed you on this]]

On Fri, 27 Aug 2021, Mike Javorski wrote:
> I just tried the same mount with 4 different nfsvers values: 3, 4.0, 4.1 and 4.2
>
> At first I thought it might be "working" because I only got freezes
> with 4.2 at first, but I went back and re-tested (to be sure) and got
> freezes with all 4 versions. So the nfsvers setting doesn't seem to
> have an impact. I did verify at each pass that the 'nfsvers=' value
> was present and correct in the mount output.
>
> FYI: another user posted on the archlinux reddit with a similar issue,
> I suggested they try with a 5.12 kernel and that "solved" the issue
> for them as well.

Well... I have good news and I have bad news.

First the good.
I reviewed all the symptoms again, and browsed the commits between
working and not-working, and the only pattern that made any sense was
that there was some issue with memory allocation. The pauses - I
reasoned - were most likely pauses while allocating memory.

So instead of testing in a VM with 2G of RAM, I tried 512MB, and
suddenly the problem was trivial to reproduce. Specifically I created a
(sparse) 1GB file on the test VM, exported it over NFS, and ran "md5sum"
on the file from an NFS client. With 5.12 this reliably takes about 90
seconds (as it does with 2G RAM). On 5.13 and 512MB RAM, it usually
takes a lot longer: 5, 6, 7, 8 minutes (and assorted seconds).

The most questionable nfsd/memory-related patch in 5.13 is

 Commit f6e70aab9dfe ("SUNRPC: refresh rq_pages using a bulk page allocator")

I reverted that and now the problem is no longer there. Gone. 90 seconds
every time.

Now the bad news: I don't know why. That patch should be a good patch,
with a small performance improvement, particularly at very high loads
(maybe even a big improvement at very high loads).
The problem must be in alloc_pages_bulk_array(), which is a new
interface, so it is not possible to bisect.

So I might have a look at the code next week, but I've cc:ed Mel Gorman
in case he comes up with some ideas sooner.

For now, you can just revert that patch.

Thanks for all the testing you did!! It certainly helped.

NeilBrown
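
For reference, the reproduction described above (a sparse 1GB file read
end to end from an NFS client, as md5sum would do) could be approximated
with a short Python sketch. This is not the script used in the report:
the function names and the example paths are hypothetical, and exporting
and mounting the filesystem are assumed to be done separately.

```python
import hashlib
import time


def create_sparse_file(path, size_bytes):
    """Create a file of size_bytes without writing any data.

    On filesystems that support holes this yields a sparse file.
    """
    with open(path, "wb") as f:
        f.truncate(size_bytes)


def timed_md5(path, chunk_size=1 << 20):
    """Read the whole file through MD5 and time it.

    Returns (hex digest, elapsed seconds), roughly what timing
    `md5sum path` would report.
    """
    digest = hashlib.md5()
    start = time.monotonic()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest(), time.monotonic() - start


# Hypothetical usage (paths are assumptions, not from the report):
#   on the server:  create_sparse_file("/srv/export/testfile", 1 << 30)
#   on the client:  print(timed_md5("/mnt/nfs/testfile"))
```

Comparing the elapsed time on the client across kernel versions and
server RAM sizes would be the point of the exercise; the digest itself
only confirms that the whole file was read.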