On Sun, May 16, 2021 at 11:18 PM Michael Wakabayashi <mwakabayashi@xxxxxxxxxx> wrote:
>
> Hi,
>
> We're seeing what looks like an NFSv4 issue.
>
> Mounting an NFS server that is down (ping to this NFS server's IP address does not respond) will block _all_ other NFS mount attempts even if the NFS servers are available and working properly (these subsequent mounts hang).
>
> If I kill the NFS mount process that's trying to mount the dead NFS server, the NFS mounts that were blocked will immediately unblock and mount successfully, which suggests the first mount command is blocking the other mount commands.
>
>
> I verified this behavior using a newly built mount.nfs command from the recent nfs-utils 2.5.3 package installed on a recent version of Ubuntu Cloud Image 21.04:
> * https://sourceforge.net/projects/nfs/files/nfs-utils/2.5.3/
> * https://cloud-images.ubuntu.com/releases/hirsute/release-20210513/ubuntu-21.04-server-cloudimg-amd64.ova
>
>
> The reason this looks like it is specific to NFSv4 is from the following output showing "vers=4.2":
> > $ strace /sbin/mount.nfs <unreachable-IP-address>:/path /tmp/mnt
> > [ ... cut ... ]
> > mount("<unreachable-IP-address>:/path", "/tmp/mnt", "nfs", 0, "vers=4.2,addr=<unreachable-IP-address>,clien"...^C^Z
>
> Also, if I try the same mount.nfs commands but specifying NFSv3, the mount to the dead NFS server hangs, but the mounts to the operational NFS servers do not block and mount successfully; this bug doesn't happen when using NFSv3.
>
>
> We reported this issue under util-linux here:
> https://github.com/karelzak/util-linux/issues/1309
> [mounting nfs server which is down blocks all other nfs mounts on same machine #1309]
>
> I also found an older bug on this mailing list that had similar symptoms (but could not tell if it was the same problem or not):
> https://patchwork.kernel.org/project/linux-nfs/patch/87vaori26c.fsf@xxxxxxxxxxxxxxxxxxxxxxxx/
> [[PATCH/RFC] NFSv4: don't let hanging mounts block other mounts]
>
> Thanks,
> Mike

Hi Mike,

This is not a very helpful reply, but I was curious whether I could reproduce your issue, and I was not successful. I was able to start a mount to an unreachable IP address, which hangs, and then mount an existing server without issues. Ubuntu 21.04 seems to be 5.11-based, so I tried upstream 5.11 together with the latest upstream nfs-utils (instead of the older version my distro ships).

To debug, perhaps capture the output of the nfs4 and sunrpc tracepoints. Also get the dmesg output after doing "echo t > /proc/sysrq-trigger" to see where the mounts are hanging.
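
A rough sketch of those two debugging steps in shell, in case it saves some typing. The helper names (set_trace_groups, dump_task_stacks) and the TRACEFS and SYSRQ_TRIGGER variables are made up for illustration. It assumes tracefs is mounted at /sys/kernel/tracing (older setups expose it under /sys/kernel/debug/tracing), that the NFS client modules are loaded so the nfs4 and sunrpc event groups exist, and that it runs as root.

```shell
#!/bin/sh
# Sketch only: helper names and variables here are hypothetical.
# TRACEFS default is an assumption; adjust if tracefs is mounted elsewhere.
TRACEFS="${TRACEFS:-/sys/kernel/tracing}"

# Write 1 (enable) or 0 (disable) to each named tracepoint group,
# e.g. "set_trace_groups 1 nfs4 sunrpc".
set_trace_groups() {
    value="$1"
    shift
    for group in "$@"; do
        echo "$value" > "$TRACEFS/events/$group/enable" || return 1
    done
}

# Ask the kernel to log the state and stack of every task (sysrq 't').
# The result lands in the kernel log, readable with dmesg.
dump_task_stacks() {
    echo t > "${SYSRQ_TRIGGER:-/proc/sysrq-trigger}"
}
```

Typical use, as root: source the file, run `set_trace_groups 1 nfs4 sunrpc`, start the mount to the dead server and then the mount to the working one, run `dump_task_stacks` while both are hung, and save the contents of `$TRACEFS/trace` along with the `dmesg` output. `set_trace_groups 0 nfs4 sunrpc` turns the events back off afterwards.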