Re: file server freezes with all nfsds stuck in D state after upgrade to Debian bookworm

Chuck Lever III <chuck.lever@xxxxxxxxxx> · Thu, 6 Apr 2023 13:48:06 +0000

> On Apr 6, 2023, at 7:09 AM, Christian Herzog <herzog@xxxxxxxxxxxx> wrote:
> 
> Dear all,
> 
> for our researchers we are running file servers in the hundreds-of-TiB to
> low-PiB range that export via NFS and SMB. Storage is iSCSI-over-Infiniband
> LUNs LVM'ed into individual XFS file systems. With Ubuntu 18.04 nearing EOL,
> we prepared an upgrade to Debian bookworm and tests went well. About a week
> after one of the upgrades, we ran into the first occurrence of our problem: all
> of a sudden, all nfsds enter the D state and are not recoverable. However, the
> underlying file systems seem fine and can be read and written to. The only way
> out appears to be to reboot the server. The only clues are the frozen nfsds
> and stack traces like
> 
> [<0>] rq_qos_wait+0xbc/0x130
> [<0>] wbt_wait+0xa2/0x110

Hi Christian, you have a pretty deep storage stack!
rq_qos_wait is a few layers below NFSD. Jens Axboe
and linux-block are the folks who maintain that.
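
For a report to linux-block, it may help to capture which nfsd
threads are in D state and whether each one is sitting in wbt_wait.
A minimal Python sketch, assuming root access to /proc/<pid>/stack;
the helper names (frames, stuck_in_wbt, d_state_threads) are
hypothetical and not part of any existing tool:

```python
#!/usr/bin/env python3
"""Sketch: list nfsd threads in D state and flag those waiting in wbt_wait."""
import os
import re

# Matches lines such as "[<0>] wbt_wait+0xa2/0x110" or
# "[<0>] xfs_file_fsync+0x5a/0x2a0 [xfs]" and captures the function name.
FRAME_RE = re.compile(r"^\[<[0-9a-fx]+>\]\s+([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def frames(stack_text):
    """Return function names from /proc/<pid>/stack-style text, innermost first."""
    names = []
    for line in stack_text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:
            names.append(m.group(1))
    return names


def stuck_in_wbt(stack_text):
    """True if the stack contains a wbt_wait frame (writeback throttling)."""
    return "wbt_wait" in frames(stack_text)


def d_state_threads(comm="nfsd"):
    """Yield (pid, stack_text) for tasks named `comm` that are in state D."""
    for pid in filter(str.isdigit, os.listdir("/proc")):
        try:
            with open(f"/proc/{pid}/stat") as f:
                stat = f.read()
            # Format is "pid (comm) state ..."; comm may contain spaces,
            # so locate it via the outermost parentheses.
            name = stat[stat.index("(") + 1 : stat.rindex(")")]
            state = stat[stat.rindex(")") + 2]
            if name != comm or state != "D":
                continue
            with open(f"/proc/{pid}/stack") as f:  # reading this needs root
                yield int(pid), f.read()
        except (OSError, ValueError, IndexError):
            # The task exited, or the file was unreadable; skip it.
            continue


if __name__ == "__main__":
    for pid, stack in d_state_threads():
        tag = "in wbt_wait" if stuck_in_wbt(stack) else "elsewhere"
        print(f"nfsd pid {pid}: D state, {tag}")
        print(stack)
```

Run as root during the freeze, this prints one stack per blocked
nfsd thread, which can be attached to the report as-is.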

> [<0>] __rq_qos_throttle+0x20/0x40
> [<0>] blk_mq_submit_bio+0x2d3/0x580
> [<0>] submit_bio_noacct_nocheck+0xf7/0x2c0
> [<0>] iomap_submit_ioend+0x4b/0x80
> [<0>] iomap_do_writepage+0x4b4/0x820
> [<0>] write_cache_pages+0x180/0x4c0
> [<0>] iomap_writepages+0x1c/0x40
> [<0>] xfs_vm_writepages+0x79/0xb0 [xfs]
> [<0>] do_writepages+0xbd/0x1c0
> [<0>] filemap_fdatawrite_wbc+0x5f/0x80
> [<0>] __filemap_fdatawrite_range+0x58/0x80
> [<0>] file_write_and_wait_range+0x41/0x90
> [<0>] xfs_file_fsync+0x5a/0x2a0 [xfs]
> [<0>] nfsd_commit+0x93/0x190 [nfsd]
> [<0>] nfsd4_commit+0x5e/0x90 [nfsd]
> [<0>] nfsd4_proc_compound+0x352/0x660 [nfsd]
> [<0>] nfsd_dispatch+0x167/0x280 [nfsd]
> [<0>] svc_process_common+0x286/0x5e0 [sunrpc]
> [<0>] svc_process+0xad/0x100 [sunrpc]
> [<0>] nfsd+0xd5/0x190 [nfsd]
> [<0>] kthread+0xe6/0x110
> [<0>] ret_from_fork+0x1f/0x30
> 
> (we've also seen nfsd3). It's very sporadic, we have no idea what's triggering
> it and it has now happened 4 times on one server and once on a second.
> Needless to say, these are production systems, so we have a window of a few
> minutes for debugging before people start yelling. We've thrown everything we
> could at our test setup but so far haven't been able to trigger it.
> Any pointers would be highly appreciated.
> 
> 
> thanks and best regards,
> -Christian
> 
> 
> 
> cat /etc/os-release 
> PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
> 
> uname -vr
> 6.1.0-7-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.20-1 (2023-03-19)
> 
> apt list --installed '*nfs*'
> libnfsidmap1/testing,now 1:2.6.2-4 amd64 [installed,automatic]
> nfs-common/testing,now 1:2.6.2-4 amd64 [installed]
> nfs-kernel-server/testing,now 1:2.6.2-4 amd64 [installed]
> 
> nfsconf -d
> [exportd]
> debug = all
> [exportfs]
> debug = all
> [general]
> pipefs-directory = /run/rpc_pipefs
> [lockd]
> port = 32769
> udp-port = 32769
> [mountd]
> debug = all
> manage-gids = True
> port = 892
> [nfsd]
> debug = all
> port = 2049
> threads = 48
> [nfsdcld]
> debug = all
> [nfsdcltrack]
> debug = all
> [sm-notify]
> debug = all
> outgoing-port = 846
> [statd]
> debug = all
> outgoing-port = 2020
> port = 662
> 
> 
> 
> -- 
> Dr. Christian Herzog <herzog@xxxxxxxxxxxx>  support: +41 44 633 26 68
> Head, IT Services Group, HPT H 8              voice: +41 44 633 39 50
> Department of Physics, ETH Zurich           
> 8093 Zurich, Switzerland                     http://isg.phys.ethz.ch/

--
Chuck Lever