Dear all,

disclaimer: this email was originally posted to linux-nfs since we believed the problem to be in nfsd, but Chuck Lever suggested that rq_qos_wait hinted at a problem further down in the storage stack and referred us to you guys, so here we are:

for our researchers we are running file servers in the hundreds-of-TiB to low-PiB range that export via NFS and SMB. Storage is iSCSI-over-Infiniband LUNs LVM'ed into individual XFS file systems. With Ubuntu 18.04 nearing EOL, we prepared an upgrade to Debian bookworm and tests went well. About a week after one of the upgrades, we ran into the first occurrence of our problem: all of a sudden, all nfsds enter the D state and are not recoverable. However, the underlying file systems seem fine and can be read and written to. The only way out appears to be to reboot the server. The only clues are the frozen nfsds and stack traces like

[<0>] rq_qos_wait+0xbc/0x130
[<0>] wbt_wait+0xa2/0x110
[<0>] __rq_qos_throttle+0x20/0x40
[<0>] blk_mq_submit_bio+0x2d3/0x580
[<0>] submit_bio_noacct_nocheck+0xf7/0x2c0
[<0>] iomap_submit_ioend+0x4b/0x80
[<0>] iomap_do_writepage+0x4b4/0x820
[<0>] write_cache_pages+0x180/0x4c0
[<0>] iomap_writepages+0x1c/0x40
[<0>] xfs_vm_writepages+0x79/0xb0 [xfs]
[<0>] do_writepages+0xbd/0x1c0
[<0>] filemap_fdatawrite_wbc+0x5f/0x80
[<0>] __filemap_fdatawrite_range+0x58/0x80
[<0>] file_write_and_wait_range+0x41/0x90
[<0>] xfs_file_fsync+0x5a/0x2a0 [xfs]
[<0>] nfsd_commit+0x93/0x190 [nfsd]
[<0>] nfsd4_commit+0x5e/0x90 [nfsd]
[<0>] nfsd4_proc_compound+0x352/0x660 [nfsd]
[<0>] nfsd_dispatch+0x167/0x280 [nfsd]
[<0>] svc_process_common+0x286/0x5e0 [sunrpc]
[<0>] svc_process+0xad/0x100 [sunrpc]
[<0>] nfsd+0xd5/0x190 [nfsd]
[<0>] kthread+0xe6/0x110
[<0>] ret_from_fork+0x1f/0x30

(we've also seen nfsd3). It's very sporadic, we have no idea what's triggering it, and it has now happened 4 times on one server and once on a second.
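
A minimal sketch of how traces like the one above could be collected in one go: it scans /proc for nfsd threads in the D state, reads their kernel stacks, and also reads the per-device writeback-throttling latency target and in-flight counters, since the trace is blocked in wbt_wait. This is an illustration only, not the tooling used on these servers. The function names (task_state, blocked_tasks, wbt_state) are made up for the example, and it assumes root privileges and a kernel that exposes /proc/<pid>/stack, /sys/block/<dev>/queue/wbt_lat_usec and /sys/block/<dev>/inflight.

```python
#!/usr/bin/env python3
"""Sketch: dump kernel stacks of blocked nfsd threads plus per-device WBT state."""
import os


def task_state(stat_text):
    """Return (comm, state) parsed from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    lpar = stat_text.index("(")
    rpar = stat_text.rindex(")")
    comm = stat_text[lpar + 1:rpar]
    state = stat_text[rpar + 1:].split()[0]
    return comm, state


def blocked_tasks(proc_root="/proc", comm="nfsd"):
    """Yield (pid, stack) for every task named `comm` in uninterruptible sleep (D)."""
    if not os.path.isdir(proc_root):
        return
    pids = sorted(int(e) for e in os.listdir(proc_root) if e.isdigit())
    for pid in pids:
        base = os.path.join(proc_root, str(pid))
        try:
            with open(os.path.join(base, "stat")) as f:
                name, state = task_state(f.read())
            if name != comm or state != "D":
                continue
            with open(os.path.join(base, "stack")) as f:  # readable by root only
                stack = f.read()
        except (OSError, ValueError, IndexError):
            continue  # task exited, permission denied, or unparsable stat
        yield pid, stack


def _read(path):
    """Return the stripped contents of a small sysfs file, or None if unreadable."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None


def wbt_state(sys_block="/sys/block"):
    """Map each block device that exposes wbt_lat_usec to its WBT target and inflight counts."""
    if not os.path.isdir(sys_block):
        return {}
    state = {}
    for dev in sorted(os.listdir(sys_block)):
        lat = _read(os.path.join(sys_block, dev, "queue", "wbt_lat_usec"))
        if lat is None:
            continue  # device has no writeback throttling attribute
        state[dev] = {
            "wbt_lat_usec": lat,
            "inflight": _read(os.path.join(sys_block, dev, "inflight")),
        }
    return state


if __name__ == "__main__":
    for pid, stack in blocked_tasks():
        print(f"--- nfsd pid {pid} (state D) ---")
        print(stack, end="")
    for dev, info in wbt_state().items():
        print(f"{dev}: wbt_lat_usec={info['wbt_lat_usec']} inflight={info['inflight']}")
```

Run as root and redirected to a file, this would preserve the state of a hung server before a reboot; the in-flight counters may help tell whether requests are actually stuck below the throttle or whether only the waiters are stuck.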

Needless to say, these are production systems, so we have a window of a few minutes for debugging before people start yelling. We've thrown everything we could at our test setup but so far haven't been able to trigger it. Any pointers would be highly appreciated.

thanks and best regards,
-Christian


cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"

uname -vr
6.1.0-7-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.20-1 (2023-03-19)

apt list --installed '*nfs*'
libnfsidmap1/testing,now 1:2.6.2-4 amd64 [installed,automatic]
nfs-common/testing,now 1:2.6.2-4 amd64 [installed]
nfs-kernel-server/testing,now 1:2.6.2-4 amd64 [installed]

nfsconf -d
[exportd]
 debug = all
[exportfs]
 debug = all
[general]
 pipefs-directory = /run/rpc_pipefs
[lockd]
 port = 32769
 udp-port = 32769
[mountd]
 debug = all
 manage-gids = True
 port = 892
[nfsd]
 debug = all
 port = 2049
 threads = 48
[nfsdcld]
 debug = all
[nfsdcltrack]
 debug = all
[sm-notify]
 debug = all
 outgoing-port = 846
[statd]
 debug = all
 outgoing-port = 2020
 port = 662

--
Dr. Christian Herzog <herzog@xxxxxxxxxxxx>  support: +41 44 633 26 68
Head, IT Services Group, HPT H 8              voice: +41 44 633 39 50
Department of Physics, ETH Zurich
8093 Zurich, Switzerland                     http://isg.phys.ethz.ch/

----- End forwarded message -----