On Mon, 09 Aug 2021, Mike Javorski wrote:
> I have been experiencing nfs file access hangs with multiple release
> versions of the 5.13.x linux kernel. In each case, all file transfers
> freeze for 5-10 seconds and then resume. This seems worse when reading
> through many files sequentially.

A particularly useful debugging tool for NFS freezes is to run

   rpcdebug -m rpc -c all

while the system appears frozen.  As you only have a 5-10 second window
this might be tricky.
Setting or clearing debug flags in the rpc module (whether they are
already set or not) has the side effect of listing all RPC "tasks" which
are waiting for a reply.  Seeing that task list can often be useful.

The task list appears in "dmesg" output.  If there are no tasks
waiting, nothing will be written, which might lead you to think it
didn't work.

As Chuck hinted, tcpdump is invaluable for this sort of problem.

   tcpdump -s 0 -w /tmp/somefile.pcap port 2049

will capture NFS traffic.  If this can start before a hang, and finish
after, it may contain useful information.  Doing that in a way that
doesn't create an enormous file might be a challenge.  It would help if
you found a way to trigger the problem.  Take note of the circumstances
when it seems to happen the most.  If you can only produce a large file,
we can probably still work with it.

   tshark -r /tmp/somefile.pcap

will report the capture one line per packet.  You can look for the
appropriate timestamp, note the frame numbers, and use "editcap"
to extract a suitable range of packets.

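As a rough sketch of the steps above, something like the following
might do.  The function names are made up for illustration, and the
ring size (10 files of about 100 MB each) is an arbitrary guess rather
than anything tuned, so adjust to taste.  All of it needs root.

```shell
#!/bin/sh
# Hypothetical helper functions; names and defaults are illustrative only.

# List the RPC tasks waiting for a reply.  Clearing the rpc debug flags has
# the side effect of dumping the task list to the kernel log, so read the
# tail of dmesg straight afterwards.
dump_rpc_tasks() {
    rpcdebug -m rpc -c all
    dmesg | tail -n "${1:-50}"      # $1: number of log lines to show
}

# Capture NFS traffic into a ring of files, so the capture can be left
# running until a hang occurs without growing without bound.
# -C 100 starts a new file about every 100 million bytes and -W 10 keeps
# at most 10 files, overwriting the oldest (both values are arbitrary).
capture_nfs_ring() {
    tcpdump -s 0 -C 100 -W 10 -w "${1:-/tmp/somefile.pcap}" port 2049
}

# Keep only frames $3 through $4 of capture $1, writing them to $2.
# editcap drops the selected packets by default; -r keeps them instead.
extract_frames() {
    editcap -r "$1" "$2" "${3}-${4}"
}
```

For example: leave capture_nfs_ring running in one terminal, run
dump_rpc_tasks in another as soon as a hang starts, then stop the
capture once the transfer resumes.  With -C/-W tcpdump appends a number
to the file name, so look for the file covering the hang, find the
frame numbers with tshark, and run extract_frames on that file with
those two numbers to get something small enough to send.
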
NeilBrown

>
> My server:
> - Archlinux w/ a distribution provided kernel package
> - filesystems exported with "rw,sync,no_subtree_check,insecure" options
>
> Client:
> - Archlinux w/ latest distribution provided kernel (5.13.9-arch1-1 at writing)
> - nfs mounted via /net autofs with "soft,nodev,nosuid" options
> (ver=4.2 is indicated in mount)
>
> I have tried the 5.13.x kernel several times since the first arch
> release (most recently with 5.13.9-arch1-1), all with similar results.
> Each time, I am forced to downgrade the linux package to a 5.12.x
> kernel (5.12.15-arch1 as of writing) to clear up the transfer issues
> and stabilize performance. No other changes are made between tests. I
> have confirmed the freezing behavior using both ext4 and btrfs
> filesystems exported from this server.
>
> At this point I would appreciate some guidance in what to provide in
> order to diagnose and resolve this issue. I don't have a lot of kernel
> debugging experience, so instruction would be helpful.
>
> - mike
>
>