[linux-net added: a stubborn NFS client hang doing uncompression of some
 Steam games over NFS now looks like it might be TCP-stack or even
 network-device-related rather than NFS-related]

On 24 Feb 2021, bfields@xxxxxxxxxxxx said:

> On Tue, Feb 23, 2021 at 11:58:51PM +0000, Trond Myklebust wrote:
>> On Tue, 2021-02-23 at 17:57 -0500, J. Bruce Fields wrote:
>> > On Sun, Jan 03, 2021 at 02:27:51PM +0000, Nick Alcock wrote:
>> > > Relevant exports, from /proc/fs/nfs/exports:
>> > >
>> > > / 192.168.16.0/24,silk.wkstn.nix(ro,insecure,no_root_squash,sync,no_wdelay,no_subtree_check,v4root,fsid=0,uuid=0a4a4563:00764033:8c827c0e:989cf534,sec=390003:390004:390005:1)
>> > > /home/.loom.srvr.nix *.srvr.nix,fold.srvr.nix(rw,root_squash,sync,wdelay,no_subtree_check,uuid=0a4a4563:00764033:8c827c0e:989cf534,sec=1)
>>
>> Isn't that trying to export the same filesystem as '/' on the line
>> above using conflicting export options?
>
> Yes, and exporting something on the root filesystem is generally a bad
> idea for the usual reasons.
>
> I don't see any explanation for the write hang there, though.
>
> Unless maybe if it leaves mountd unable to answer an upcall for some
> reason, hm.
>
> I think that would show up in /proc/net/rpc/nfsd.fh/content.  Try
> cat'ing that file after 'rpcdebug -m rpc -s cache' and that should show
> if there's an upcall that's not getting answered.

OK, I got some debugging: this is still happening on 5.11, but it's
starting to look more like a problem below NFS.

There are no unanswered upcalls, and though I did eventually manage to
get a multimegabyte pile of NFS debugging info (after fighting with the
problem that the Steam client has internal timeouts that are silently
exceeded and break things if you just leave debugging on), I suspect I
don't need to bother you with it, because the packet capture was more
informative.

I have a complete capture of everything on the wire from the moment
Steam started, but it's nearly 150MiB xz'ed and includes a lot of boring
nonsense related to Steam's revalidation of everything because the last
startup crashed. It probably also includes things like my Steam account
password, getdents of my home directory, etc., so if you want it I can
send you the details privately. But this capture of only the last few
thousand NFS packets is interesting enough:
<http://www.esperi.org.uk/~nix/temporary/nfs-trouble.cap>.

Looking at this in Wireshark, everything goes fine (ignoring the usual
offloading-associated checksum whining) until packet 1644, when we
suddenly start seeing out-of-order packets and retransmissions (on an
otherwise totally idle 10GbE subnet). They get more and more common,
and at packet 1711 the client loses patience and hits the host with a
RST. This is... not likely to do NFS any good at all, and with $HOME
served over the same connection I'm not surprised the client goes
fairly unresponsive after that and the hangcheck timer starts screaming
about processes blocked on I/O to some of the odder filesystems I'm
getting over NFS on that client, like /var/tmp and /var/account/acct
(so, yes, all process termination likely blocks forever on this).

So it looks like NFS is being betrayed by TCP and/or some lower layer.
(Also, NFS doesn't seem to try too hard to recover -- or perhaps, when
it does, the new session goes just as wrong.)
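(If anyone wants to check my reading of the capture without loading it
into the Wireshark GUI, something like this should pull out the frames
I'm talking about -- assuming a reasonably recent tshark; the filter
fields are just Wireshark's standard TCP analysis flags:

    # list the retransmitted / out-of-order / RST frames in the capture
    tshark -r nfs-trouble.cap \
        -Y 'tcp.analysis.retransmission || tcp.analysis.out_of_order || tcp.flags.reset == 1'

    # and a rough count of how bad the retransmissions get
    tshark -r nfs-trouble.cap -Y 'tcp.analysis.retransmission' | wc -l

The frame numbers it prints should match the packet numbers I quoted
above.)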
The socket statistics on the server report 22 rx drops since boot (out
of 11903K) and no errors of any kind. Possibly each of these drops
corresponds to one of the test runs I've been doing, but there can't be
more than one or two drops per test (I must have crashed the client
over ten times trying to get dumps), which if true seems more fragile
than I expect NFS to be :) so they might well just be coincidental.

I simplified my network for this test run, so the link has literally no
hosts on it right now other than the NFS server and client at issue and
a 10GbE Netgear GS110EMX switch, which reports zero errors and no
packets dropped since it was rebooted months ago. Both ends are the
same Intel X540-AT2, driven by ixgbe. The network cabling has never
given me the least cause for concern, and I've never seen things
getting hit by RSTs when talking to this host before. I've certainly
never seen *NFS* getting a RST. It happens consistently, anyway, which
argues strongly against the cabling :)

Neither client nor server is using any sort of iptables (it's not even
compiled in), and while the server is doing a bit of policy-based
routing and running a couple of bridges for VMs, it's not doing
anything that should cause this (and this was replicated when the
bridges had nothing on them other than the host in any case).
Syncookies are compiled out on both hosts (they're still on on the
firewall, but that's not participating in this and isn't even on the
same subnet).

Just in case, the complete set of network-related sysctls changed from
the defaults on the client:

    net.ipv6.conf.default.keep_addr_on_down = 1

and on the server:

    net.ipv4.ip_forward = 1
    net.ipv6.conf.all.forwarding = 1
    net.ipv6.conf.default.keep_addr_on_down = 1
    net.ipv4.ping_group_range = 73 73
    net.core.bpf_jit_enable = 1

None of these seem too likely to cause *this*. I wonder how I could
track down where this mess is coming from...
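Next time it wedges I'll probably watch the counters on both ends while
it happens rather than relying on totals since boot; roughly something
like the following (the interface name is just whatever the 10GbE port
happens to be called on each box, eth0 here as a placeholder, and 2049
is of course the NFS port):

    # NIC-level drop/error counters (ixgbe exposes a pile of these)
    ethtool -S eth0 | grep -Ei 'drop|err|miss|discard'

    # kernel TCP counters, to see whether retransmits/resets tick up
    # as the hang develops
    nstat -az | grep -Ei 'retrans|rst|abort'

    # per-connection TCP state of the NFS socket (retransmits, cwnd, rtt)
    ss -tin '( dport = :2049 or sport = :2049 )'

If the switch, the cabling, and the NICs all insist they're clean, at
least that should narrow down which end is losing or delaying segments.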