Hello Trond,

On Mon, Aug 22, 2022 at 5:01 PM Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
>
> On Mon, 2022-08-22 at 16:43 +0200, Igor Raits wrote:
> > Hello Trond,
> >
> > On Mon, Aug 22, 2022 at 4:02 PM Trond Myklebust
> > <trondmy@xxxxxxxxxxxxxxx> wrote:
> > >
> > > On Mon, 2022-08-22 at 10:16 +0200, Igor Raits wrote:
> > > > Hello everyone,
> > > >
> > > > Hopefully I'm sending this to the right place…
> > > > We recently started seeing the following stack trace quite often on
> > > > our VMs that use NFS extensively (I think after upgrading to
> > > > 5.18.11+, but I am not sure exactly when; it definitely happens on
> > > > 5.18.15):
> > > >
> > > > INFO: task kworker/u36:10:377691 blocked for more than 122 seconds.
> > > >       Tainted: G            E     5.18.15-1.gdc.el8.x86_64 #1
> > > > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > > task:kworker/u36:10 state:D stack:    0 pid:377691 ppid:     2 flags:0x00004000
> > > > Workqueue: writeback wb_workfn (flush-0:308)
> > > > Call Trace:
> > > >  <TASK>
> > > >  __schedule+0x38c/0x7d0
> > > >  schedule+0x41/0xb0
> > > >  io_schedule+0x12/0x40
> > > >  __folio_lock+0x110/0x260
> > > >  ? filemap_alloc_folio+0x90/0x90
> > > >  write_cache_pages+0x1e3/0x4d0
> > > >  ? nfs_writepage_locked+0x1d0/0x1d0 [nfs]
> > > >  nfs_writepages+0xe1/0x200 [nfs]
> > > >  do_writepages+0xd2/0x1b0
> > > >  ? check_preempt_curr+0x47/0x70
> > > >  ? ttwu_do_wakeup+0x17/0x180
> > > >  __writeback_single_inode+0x41/0x360
> > > >  writeback_sb_inodes+0x1f0/0x460
> > > >  __writeback_inodes_wb+0x5f/0xd0
> > > >  wb_writeback+0x235/0x2d0
> > > >  wb_workfn+0x348/0x4a0
> > > >  ? put_prev_task_fair+0x1b/0x30
> > > >  ? pick_next_task+0x84/0x940
> > > >  ? __update_idle_core+0x1b/0xb0
> > > >  process_one_work+0x1c5/0x390
> > > >  worker_thread+0x30/0x360
> > > >  ? process_one_work+0x390/0x390
> > > >  kthread+0xd7/0x100
> > > >  ? kthread_complete_and_exit+0x20/0x20
> > > >  ret_from_fork+0x1f/0x30
> > > >  </TASK>
> > > >
> > > > I see that something very similar was fixed in btrfs
> > > > (https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-5.18.y&id=9535ec371d741fa037e37eddc0a5b25ba82d0027)
> > > > but I could not find anything similar for NFS.
> > > >
> > > > Do you happen to know if this is already fixed? If so, would you mind
> > > > sharing some commits? If not, could you help get this addressed?
> > >
> > > The stack trace you show above isn't particularly helpful for
> > > diagnosing what the problem is.
> > >
> > > All it is saying is that 'thread A' is waiting to take a page lock that
> > > is being held by a different 'thread B'. Without information on what
> > > 'thread B' is doing, and why it isn't releasing the lock, there is
> > > nothing we can conclude.
> >
> > Do you have any hint on how to debug this issue further (when it happens
> > again)? Would `virsh dump` to get a memory dump and then some kind of
> > "bt all" via crash help to get more information?
> > Or something else?
> >
> > Thanks in advance!
> > --
> > Igor Raits
>
> Please try running the following two lines of 'bash' script as root:
>
> (for tt in $(grep -l 'nfs[^d]' /proc/*/stack); do echo "${tt}:"; cat ${tt}; echo; done) >/tmp/nfs_threads.txt
>
> cat /sys/kernel/debug/sunrpc/rpc_clnt/*/tasks > /tmp/rpc_tasks.txt
>
> and then send us the output from the two files /tmp/nfs_threads.txt and
> /tmp/rpc_tasks.txt.
>
> The file nfs_threads.txt gives us a full set of stack traces from all
> processes that are currently in the NFS client code. So it should
> contain both the stack trace from your 'thread A' above, and the traces
> from all candidates for the 'thread B' process that is causing the
> blockage.
> The file rpc_tasks.txt gives us the status of any RPC calls that might
> be outstanding and might help diagnose any issues with the TCP
> connection.
>
> That should therefore give us a better starting point for root causing
> the problem.

The rpc_tasks.txt file is empty, but I did capture nfs_threads.txt from
the moment the hang occurs (see the attached file). It still happens
with 5.19.3 and 5.19.6.
--
Igor Raits
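For what it's worth, the two collection commands quoted above are easy to
wrap into a small script so that nfs_threads.txt and rpc_tasks.txt are
captured from the same moment the next time the hang shows up. A minimal
sketch, to be run as root; only the two commands themselves come from the
thread above, and the /var/tmp/nfs-debug output location is an arbitrary
assumption:

#!/usr/bin/env bash
# Collect NFS client debug state; must be run as root.
# /var/tmp/nfs-debug is an arbitrary choice of output location.
outdir=/var/tmp/nfs-debug/$(date +%Y%m%d-%H%M%S)
mkdir -p "$outdir"

# Stack traces of every task currently inside the NFS client code
# (candidates for both 'thread A' and 'thread B').
(for tt in $(grep -l 'nfs[^d]' /proc/*/stack); do
    echo "${tt}:"
    cat "${tt}"
    echo
done) > "$outdir/nfs_threads.txt"

# Status of any outstanding RPC calls; this may legitimately be empty.
cat /sys/kernel/debug/sunrpc/rpc_clnt/*/tasks > "$outdir/rpc_tasks.txt" 2>/dev/null || true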
/proc/1639450/stack:
[<0>] folio_wait_bit+0xf8/0x230
[<0>] folio_wait_writeback+0x28/0x80
[<0>] __filemap_fdatawait_range+0x89/0x120
[<0>] filemap_write_and_wait_range+0x3e/0x70
[<0>] nfs_getattr+0x423/0x440 [nfs]
[<0>] vfs_statx+0x97/0x120
[<0>] vfs_fstatat+0x51/0x70
[<0>] __do_sys_newstat+0x30/0x70
[<0>] do_syscall_64+0x37/0x90
[<0>] entry_SYSCALL_64_after_hwframe+0x63/0xcd

/proc/1686659/stack:
[<0>] folio_wait_bit+0xf8/0x230
[<0>] folio_wait_writeback+0x28/0x80
[<0>] __filemap_fdatawait_range+0x89/0x120
[<0>] filemap_write_and_wait_range+0x3e/0x70
[<0>] nfs_getattr+0x423/0x440 [nfs]
[<0>] vfs_statx+0x97/0x120
[<0>] vfs_fstatat+0x51/0x70
[<0>] __do_sys_newstat+0x30/0x70
[<0>] do_syscall_64+0x37/0x90
[<0>] entry_SYSCALL_64_after_hwframe+0x63/0xcd

/proc/218367/stack:
[<0>] folio_wait_bit+0xf8/0x230
[<0>] folio_wait_writeback+0x28/0x80
[<0>] __filemap_fdatawait_range+0x89/0x120
[<0>] filemap_write_and_wait_range+0x3e/0x70
[<0>] nfs_getattr+0x423/0x440 [nfs]
[<0>] vfs_statx+0x97/0x120
[<0>] vfs_fstatat+0x51/0x70
[<0>] __do_sys_newstat+0x30/0x70
[<0>] do_syscall_64+0x37/0x90
[<0>] entry_SYSCALL_64_after_hwframe+0x63/0xcd

/proc/2579602/stack:
[<0>] folio_wait_bit+0xf8/0x230
[<0>] folio_wait_writeback+0x28/0x80
[<0>] __filemap_fdatawait_range+0x89/0x120
[<0>] filemap_write_and_wait_range+0x3e/0x70
[<0>] nfs_getattr+0x423/0x440 [nfs]
[<0>] vfs_statx+0x97/0x120
[<0>] vfs_fstatat+0x51/0x70
[<0>] __do_sys_newstat+0x30/0x70
[<0>] do_syscall_64+0x37/0x90
[<0>] entry_SYSCALL_64_after_hwframe+0x63/0xcd

/proc/2625468/stack:
[<0>] folio_wait_bit+0xf8/0x230
[<0>] folio_wait_writeback+0x28/0x80
[<0>] __filemap_fdatawait_range+0x89/0x120
[<0>] filemap_write_and_wait_range+0x3e/0x70
[<0>] nfs_getattr+0x423/0x440 [nfs]
[<0>] vfs_statx+0x97/0x120
[<0>] vfs_fstatat+0x51/0x70
[<0>] __do_sys_newstat+0x30/0x70
[<0>] do_syscall_64+0x37/0x90
[<0>] entry_SYSCALL_64_after_hwframe+0x63/0xcd

/proc/265062/stack:
[<0>] folio_wait_bit+0xf8/0x230
[<0>] folio_wait_writeback+0x28/0x80
[<0>] __filemap_fdatawait_range+0x89/0x120
[<0>] filemap_write_and_wait_range+0x3e/0x70
[<0>] nfs_getattr+0x423/0x440 [nfs]
[<0>] vfs_statx+0x97/0x120
[<0>] vfs_fstatat+0x51/0x70
[<0>] __do_sys_newstat+0x30/0x70
[<0>] do_syscall_64+0x37/0x90
[<0>] entry_SYSCALL_64_after_hwframe+0x63/0xcd

/proc/3278822/stack:
[<0>] pnfs_update_layout+0x603/0xed0 [nfsv4]
[<0>] fl_pnfs_update_layout.constprop.18+0x23/0x1e0 [nfs_layout_nfsv41_files]
[<0>] filelayout_pg_init_write+0x3a/0x70 [nfs_layout_nfsv41_files]
[<0>] __nfs_pageio_add_request+0x294/0x470 [nfs]
[<0>] nfs_pageio_add_request_mirror+0x2f/0x40 [nfs]
[<0>] nfs_pageio_add_request+0x200/0x2d0 [nfs]
[<0>] nfs_page_async_flush+0x120/0x310 [nfs]
[<0>] nfs_writepages_callback+0x5b/0xc0 [nfs]
[<0>] write_cache_pages+0x187/0x4d0
[<0>] nfs_writepages+0xe1/0x200 [nfs]
[<0>] do_writepages+0xd2/0x1b0
[<0>] __writeback_single_inode+0x41/0x360
[<0>] writeback_sb_inodes+0x1f0/0x460
[<0>] __writeback_inodes_wb+0x5f/0xd0
[<0>] wb_writeback+0x235/0x2d0
[<0>] wb_workfn+0x312/0x4a0
[<0>] process_one_work+0x1c5/0x390
[<0>] worker_thread+0x30/0x360
[<0>] kthread+0xd7/0x100
[<0>] ret_from_fork+0x1f/0x30

/proc/3457988/stack:
[<0>] folio_wait_bit+0xf8/0x230
[<0>] folio_wait_writeback+0x28/0x80
[<0>] __filemap_fdatawait_range+0x89/0x120
[<0>] filemap_write_and_wait_range+0x3e/0x70
[<0>] nfs_getattr+0x423/0x440 [nfs]
[<0>] vfs_statx+0x97/0x120
[<0>] vfs_fstatat+0x51/0x70
[<0>] __do_sys_newstat+0x30/0x70
[<0>] do_syscall_64+0x37/0x90
[<0>] entry_SYSCALL_64_after_hwframe+0x63/0xcd

/proc/3504204/stack:
[<0>] folio_wait_bit+0xf8/0x230
[<0>] folio_wait_writeback+0x28/0x80
[<0>] __filemap_fdatawait_range+0x89/0x120
[<0>] filemap_write_and_wait_range+0x3e/0x70
[<0>] nfs_getattr+0x423/0x440 [nfs]
[<0>] vfs_statx+0x97/0x120
[<0>] vfs_fstatat+0x51/0x70
[<0>] __do_sys_newstat+0x30/0x70
[<0>] do_syscall_64+0x37/0x90
[<0>] entry_SYSCALL_64_after_hwframe+0x63/0xcd

/proc/3636913/stack:
[<0>] grab_super+0x29/0x90
[<0>] sget_fc+0xd8/0x300
[<0>] nfs_get_tree_common+0x90/0x480 [nfs]
[<0>] vfs_get_tree+0x22/0xc0
[<0>] nfs_do_submount+0x155/0x170 [nfs]
[<0>] nfs4_submount+0x145/0x190 [nfsv4]
[<0>] nfs_d_automount+0x182/0x240 [nfs]
[<0>] __traverse_mounts+0xc6/0x210
[<0>] step_into+0x37e/0x760
[<0>] walk_component+0x6e/0x1b0
[<0>] path_lookupat.isra.50+0x67/0x140
[<0>] filename_lookup.part.64+0xc1/0x190
[<0>] vfs_path_lookup+0x7c/0x90
[<0>] mount_subtree+0xe2/0x160
[<0>] do_nfs4_mount+0x23a/0x3b0 [nfsv4]
[<0>] nfs4_try_get_tree+0x44/0xb0 [nfsv4]
[<0>] vfs_get_tree+0x22/0xc0
[<0>] path_mount+0x696/0x9b0
[<0>] do_mount+0x79/0x90
[<0>] __x64_sys_mount+0xd0/0xf0
[<0>] do_syscall_64+0x37/0x90
[<0>] entry_SYSCALL_64_after_hwframe+0x63/0xcd

/proc/45399/stack:
[<0>] nfs41_callback_svc+0x1af/0x1c0 [nfsv4]
[<0>] kthread+0xd7/0x100
[<0>] ret_from_fork+0x1f/0x30

/proc/629298/stack:
[<0>] folio_wait_bit+0xf8/0x230
[<0>] folio_wait_writeback+0x28/0x80
[<0>] __filemap_fdatawait_range+0x89/0x120
[<0>] filemap_write_and_wait_range+0x3e/0x70
[<0>] nfs_getattr+0x423/0x440 [nfs]
[<0>] vfs_statx+0x97/0x120
[<0>] vfs_fstatat+0x51/0x70
[<0>] __do_sys_newstat+0x30/0x70
[<0>] do_syscall_64+0x37/0x90
[<0>] entry_SYSCALL_64_after_hwframe+0x63/0xcd

/proc/676956/stack:
[<0>] folio_wait_bit+0xf8/0x230
[<0>] folio_wait_writeback+0x28/0x80
[<0>] __filemap_fdatawait_range+0x89/0x120
[<0>] filemap_write_and_wait_range+0x3e/0x70
[<0>] nfs_getattr+0x423/0x440 [nfs]
[<0>] vfs_statx+0x97/0x120
[<0>] vfs_fstatat+0x51/0x70
[<0>] __do_sys_newstat+0x30/0x70
[<0>] do_syscall_64+0x37/0x90
[<0>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
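For completeness, the approach Igor asked about earlier in the thread (a
guest memory dump via virsh, then walking every task's stack with crash)
is another way to catch 'thread B' if live /proc collection is not an
option. This is only a rough sketch: the guest name, dump path, and
vmlinux location are assumptions and must match the actual guest and a
debuginfo build of the running kernel.

# On the hypervisor: take a memory-only dump of the hung guest.
virsh dump --memory-only --format elf myguest /var/tmp/myguest.core

# Open the dump with the crash utility against a matching vmlinux
# (here the conventional debuginfo path for 5.18.15-1.gdc.el8.x86_64).
crash /usr/lib/debug/lib/modules/5.18.15-1.gdc.el8.x86_64/vmlinux /var/tmp/myguest.core

# At the crash prompt, dump the backtrace of every task ("bt all")
# into a file for inspection.
crash> foreach bt > bt_all.txt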