Re: Stalling NFS reads with "SUNRPC: refresh rq_pages using a bulk page allocator"

On 10/12/21 10:51 PM, Chuck Lever III wrote:


On Oct 12, 2021, at 3:44 PM, Jussi Kansanen <jussi.kansanen@xxxxxxxxx> wrote:

Hello,

I started to get stalling NFS reads after upgrading to the 5.13 kernel (from 5.12), and the issue still persists in 5.14. Bisection led to commit f6e70aab9dfe0c2f79cf7dbcb1e80fa71dc60b09; reverting it from 5.13.19 seems to solve the issue.

Hello Jussi-

There have been several recent fixes in this area. Try v5.14.11.


Hello Chuck,

Upgrading to 5.14.12 fixed the problem, thanks.


The problem is as follows, and seems to be reproducible:

- Boot up system.

- Run "tar cf - /nfsshare/somelargedir | pv > /dev/null" or just any sequential read on a large enough file.

- Stalls usually start after 2-32 GB has been read, though sometimes it can take up to 200 GB of reads.

When stalling starts, the transfer rate drops to zero and all NFS shares become unresponsive. Stalls usually last 5-15 seconds and typically no errors are logged, though occasionally "nfs server not responding" errors appear on the client side. After the read resumes, it lasts only a few seconds before the next stall, and the cycle keeps repeating ...
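
For anyone who wants to log the stalls rather than watch pv, here is a minimal sketch in Python, not something used in the tests reported here. It reads a file sequentially and records every read call that blocks for at least a threshold. The function names find_stalls and report, the 1 MiB chunk size, and the 5 second threshold are all assumptions chosen to match the stall lengths described above.

```python
import time


def find_stalls(stream, chunk_size=1 << 20, stall_seconds=5.0, clock=time.monotonic):
    """Read `stream` to EOF and return (byte_offset, duration) for slow reads.

    A read counts as a stall when a single read() call takes at least
    `stall_seconds`. `clock` is injectable so the logic can be tested
    without waiting in real time.
    """
    stalls = []
    offset = 0
    while True:
        start = clock()
        chunk = stream.read(chunk_size)
        elapsed = clock() - start
        if elapsed >= stall_seconds:
            stalls.append((offset, elapsed))
        if not chunk:  # empty result means EOF
            break
        offset += len(chunk)
    return stalls


def report(path):
    """Hypothetical helper: print every stall seen while reading `path`."""
    # buffering=0 so each read() maps to a read on the underlying file
    with open(path, "rb", buffering=0) as f:
        for offset, duration in find_stalls(f):
            print(f"stall of {duration:.1f}s at byte offset {offset}")
```

Calling report() on a large file under the NFS mount would print one line per stall. Note that data already in the client's page cache is served locally, so a repeated run over the same file may not exercise the server at all.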

Tests were done over a 10Gb network with the following kernels:

client:
- 5.14.9

server:
- 5.12.19 - OK
- 5.13.19 - stalls
- 5.13.19 - OK with f6e70aab9dfe0c2f79cf7dbcb1e80fa71dc60b09 reverted
- 5.14.7  - stalls

Server kernel config included as attachment.


-Jussi Kansanen
<config.gz>

--
Chuck Lever






