Following up, using NFS loopback with a 5GB scratch device on a Google
Compute Engine VM, generic/476 passes using a 4.14 LTS, 4.19 LTS, and
5.4 LTS kernel. So this looks like a regression which is present in
5.10 LTS and newer kernels, and so instead of patching it out of the
test, I think the right thing to do is to add it to a kernel
version-specific exclude file and then file a bug with the NFS folks.

KERNEL:    kernel 4.14.284-xfstests #8 SMP Tue Jul 5 08:21:37 EDT 2022 x86_64
CMDLINE:   -c nfs/default generic/476
CPUS:      2
MEM:       7680

nfs/loopback: 1 tests, 597 seconds
  generic/476  Pass     595s
Totals: 1 tests, 0 skipped, 0 failures, 0 errors, 595s
---
KERNEL:    kernel 4.19.248-xfstests #4 SMP Sat Jun 25 10:43:45 EDT 2022 x86_64
CMDLINE:   -c nfs/default generic/476
CPUS:      2
MEM:       7680

nfs/loopback: 1 tests, 407 seconds
  generic/476  Pass     407s
Totals: 1 tests, 0 skipped, 0 failures, 0 errors, 407s
----
KERNEL:    kernel 5.4.199-xfstests #21 SMP Sun Jul 3 12:15:15 EDT 2022 x86_64
CMDLINE:   -c nfs/default generic/476
CPUS:      2
MEM:       7680

nfs/loopback: 1 tests, 404 seconds
  generic/476  Pass     404s
Totals: 1 tests, 0 skipped, 0 failures, 0 errors, 404s

See below for what I'm checking into xfstests-bld for
{kvm,gce}-xfstests. I don't believe we should be changing xfstests's
generic/476, since it *does* pass with a smaller scratch device on
older kernels. Presumably, RHEL customers would be cranky if this
issue caused their production systems to lock up, and so it should be
considered a kernel bug as opposed to a test bug.

						- Ted

commit 4a33b6721d5db9c07f295a10a8ad65d2a0021406
Author: Theodore Ts'o <tytso@xxxxxxx>
Date:   Thu Jul 21 09:54:50 2022 -0400

    test-appliance: add an nfs test exclusions for kernels newer than 5.4

    This is apparently an NFS bug which is visible in 5.10 LTS and newer
    kernels, and likely appeared sometime after 5.4. Since it causes the
    test VM to spin forever (or at least for days), let's exclude it for
    now.

    Link: https://lore.kernel.org/all/CAHLe9YaAVyBmmM8T27dudvoeAxbJ_JMQmkz7tdM1ZLnpeQW4UQ@xxxxxxxxxxxxxx/
    Signed-off-by: Theodore Ts'o <tytso@xxxxxxx>

diff --git a/test-appliance/files/root/fs/nfs/exclude b/test-appliance/files/root/fs/nfs/exclude
index 184750fb..ef4b19bc 100644
--- a/test-appliance/files/root/fs/nfs/exclude
+++ b/test-appliance/files/root/fs/nfs/exclude
@@ -10,3 +10,14 @@ generic/477
 // failing in the expected output of the linux-nfs Wiki page. So we'll
 // suppress this failure for now.
 generic/294
+
+#if LINUX_VERSION_CODE > KERNEL_VERSION(5,4,0)
+// There appears to be a regression that shows up sometime after 5.4.
+// LTS kernels for 4.14, 4.19, and 5.4 will terminate successfully,
+// but newer kernels will spin forever in some kind of deadlock or livelock
+// This apparently does not happen if the scratch device is > 27GB, so it
+// may be some kind of ENOSPC-related bug.
+// For more information see the e-mail thread starting at:
+// https://lore.kernel.org/r/CAHLe9YaAVyBmmM8T27dudvoeAxbJ_JMQmkz7tdM1ZLnpeQW4UQ@xxxxxxxxxxxxxx/
+generic/476
+#endif
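
For readers unfamiliar with the version guard in the exclude file, here
is a rough Python sketch of how a test like
"LINUX_VERSION_CODE > KERNEL_VERSION(5,4,0)" could evaluate for the
kernels tested above. It assumes the classic form of the kernel's
KERNEL_VERSION(a,b,c) macro, (a << 16) + (b << 8) + c. How xfstests-bld
actually derives LINUX_VERSION_CODE is not shown in this message, so the
keep_sublevel switch is a hypothetical knob covering both possibilities,
and the "5.10.0" release string is an illustrative value, not a kernel
from the test runs.

```python
import re

def kernel_version(major, minor, sublevel):
    # Classic form of the kernel's KERNEL_VERSION(a, b, c) macro.
    return (major << 16) + (minor << 8) + sublevel

def version_code(release, keep_sublevel):
    # Parse the leading "X.Y.Z" of a uname-style release string.
    m = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if m is None:
        raise ValueError(f"unrecognized release string: {release!r}")
    major, minor, sublevel = map(int, m.groups())
    # keep_sublevel is an assumption: it is not known from this thread
    # whether the appliance feeds the stable sublevel into the code.
    return kernel_version(major, minor, sublevel if keep_sublevel else 0)

def is_excluded(release, keep_sublevel):
    # Same comparison as the #if guard in the exclude file.
    return version_code(release, keep_sublevel) > kernel_version(5, 4, 0)

if __name__ == "__main__":
    releases = ("4.14.284-xfstests", "4.19.248-xfstests",
                "5.4.199-xfstests", "5.10.0")  # last one is hypothetical
    for release in releases:
        print(release,
              "major.minor only:", is_excluded(release, False),
              "with sublevel:", is_excluded(release, True))
```

One thing the sketch makes visible: if the sublevel is included in the
version code, 5.4.199 compares greater than KERNEL_VERSION(5,4,0) and
generic/476 would be excluded on 5.4 LTS as well. The guard only matches
the stated intent (exclude kernels newer than 5.4) if the code is built
from major.minor alone, or if the comparison is made against the next
release series.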