> On Jul 10, 2023, at 3:56 AM, Christoph Hellwig <hch@xxxxxx> wrote:
>
> On Sat, Jul 08, 2023 at 06:30:26PM +0000, Chuck Lever III wrote:
>> Hi -
>>
>> I have a "standard" test of running the git regression suite with
>> many threads against an NFS mount. I found that with 6.5-rc, the
>> test stalled and several nfsd threads on the server were stuck
>> in D state.
>
> Can you paste the exact reproducer here?

It's a twisty little maze of scripts, but it does essentially this:

1. export a test filesystem on system B

2. mount that export on system A via NFS (I think I used NFSv4.1)

3. download the latest git tarball on system A

4. unpack the tarball on the test NFS mount on system A. umount / mount

5. "make -jN all docs" on system A, where N is nprocs. umount / mount

6. "make -jN test" on system A, where N is as in step 5.

(For "make test" to work, the mounted-on dir on system A has to be
exactly the same for all steps).

My system A has 12 cores, and B has 4, fwiw. The network fabric
is InfiniBand, but I suspect that won't make much difference.

During step 6, the tests will slow down and then stop cold.
After another two minutes, on system B you'll start to see the
INFO splats about hung processes.

As an interesting side note, I have a btrfs filesystem on that
same mapper group and physical device. I'm not able to reproduce
the problem on that filesystem.


>> I can reproduce this stall 100% with both an xfs and an ext4
>> export, so I bisected with both, and both bisects landed on the
>> same commit:
>
>> On system 1: the exports are on top of /dev/mapper and reside on
>> an "INTEL SSDSC2BA400G3" SATA device.
>>
>> On system 2: the exports are on top of /dev/mapper and reside on
>> an "INTEL SSDSC2KB240G8" SATA device.
>>
>> System 1 was where I discovered the stall. System 2 is where I ran
>> the bisects.
>
> Ok. I'd be curious if this reproduces without either device mapper
> or on a non-SATA device. If you have an easy way to run it in a VM
> that'd be great. Otherwise I'll try to recreate it in various
> setups if you post the exact reproducer.

I have a way to test it on an xfs export backed by a pair of
AIC NVMe devices. Stand by.


--
Chuck Lever
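
For anyone trying to recreate this, steps 2 and 4-6 of the reproducer
above could be sketched roughly as the script below. This is only an
illustration, not the actual test scripts: the server name, export
path, mount point, and tarball name are all placeholders, NFSv4.1 is
assumed from the "I think" in step 2, and steps 1 and 3 (exporting the
filesystem on system B and downloading the tarball) are assumed to be
done already. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical sketch of the reproducer; every name below is a placeholder.
SERVER="${SERVER:-system-b}"       # assumed hostname of system B (NFS server)
EXPORT="${EXPORT:-/export/test}"   # assumed export path on system B
MNT="${MNT:-/mnt/nfstest}"         # mount point; must be identical for all steps
TARBALL="${TARBALL:-git.tar.gz}"   # git source tarball, already downloaded
NPROC="${NPROC:-$(nproc)}"         # N for "make -jN"
DRY_RUN="${DRY_RUN:-1}"            # 1 = print commands only, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# The "umount / mount" cycle between steps.
remount() {
    run umount "$MNT"
    run mount -t nfs -o vers=4.1 "$SERVER:$EXPORT" "$MNT"
}

repro() {
    run mount -t nfs -o vers=4.1 "$SERVER:$EXPORT" "$MNT"   # step 2
    run tar -xf "$TARBALL" -C "$MNT" --strip-components=1   # step 4
    remount
    run make -C "$MNT" -j"$NPROC" all docs                  # step 5
    remount
    run make -C "$MNT" -j"$NPROC" test                      # step 6
}

repro
```

Run it with DRY_RUN=0 as root on system A to actually execute the
steps; the stall, if it reproduces, would be expected during the final
"make test" step.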