On Fri, 2018-11-30 at 16:19 +0000, Cristian Marussi wrote: > Hi > > On 29/11/2018 19:56, Trond Myklebust wrote: > > On Thu, 2018-11-29 at 19:28 +0000, Cristian Marussi wrote: > > > Hi Trond, Catalin > [snip] > > Question to you both: when this happens, does /proc/*/stack show > > any of > > the processes hanging in the socket or sunrpc code? If so, can you > > please send me examples of those stack traces (i.e. the contents of > > /proc/<pid>/stack for the processes that are hanging) > > (using a reverse shell since starting ssh causes a lot of pain and > traffic) > > Looking at NFS traffic holes(30-40 secs) to detect Client side > various HANGS > ------------------------------------------------------------------- > --------- > > root@sqwt-ubuntu:/opt/lkp-tests# nc -lk -e /bin/bash -s 192.168.0.1 > -p 1235 & > root@sqwt-ubuntu:/opt/lkp-tests# lkp run ./dbench-100%.yaml > > $ nc 192.168.0.1 1235 > > cat /proc/2833/cmdline > ruby/opt/lkp-tests/bin/run-local./dbench-100%.yaml > > HANG CLOSE > ---------- > cat /proc/2833/stack > [<0>] __switch_to+0x6c/0x90 > [<0>] rpc_wait_bit_killable+0x2c/0xb0 > [<0>] __rpc_wait_for_completion_task+0x3c/0x48 > [<0>] nfs4_do_close+0x1ec/0x2b0 > [<0>] __nfs4_close+0x130/0x198 > [<0>] nfs4_close_sync+0x34/0x40 > [<0>] nfs4_close_context+0x40/0x50 > [<0>] __put_nfs_open_context+0xac/0x118 > [<0>] nfs_file_clear_open_context+0x38/0x58 > [<0>] nfs_file_release+0x7c/0x90 > [<0>] __fput+0x94/0x1c0 > [<0>] ____fput+0x20/0x30 > [<0>] task_work_run+0x98/0xb8 > [<0>] do_notify_resume+0x2d0/0x318 > [<0>] work_pending+0x8/0x10 > [<0>] 0xffffffffffffffff > > HANG READ > --------- > cat /proc/2833/stack > [<0>] __switch_to+0x6c/0x90 > [<0>] io_schedule+0x20/0x40 > [<0>] wait_on_page_bit_killable+0x164/0x260 > [<0>] generic_file_read_iter+0x1c4/0x820 > [<0>] nfs_file_read+0xa4/0x108 > [<0>] __vfs_read+0x120/0x170 > [<0>] vfs_read+0x94/0x150 > [<0>] ksys_read+0x6c/0xd8 > [<0>] __arm64_sys_read+0x24/0x30 > [<0>] el0_svc_handler+0x7c/0x118 > [<0>] el0_svc+0x8/0xc > [<0>] 0xffffffffffffffff > > > HANG STAT > --------- > cat /proc/2833/stack > [<0>] __switch_to+0x6c/0x90 > [<0>] rpc_wait_bit_killable+0x2c/0xb0 > [<0>] __rpc_execute+0x1cc/0x528 > [<0>] rpc_execute+0xe4/0x1b0 > [<0>] rpc_run_task+0x130/0x168 > [<0>] nfs4_call_sync_sequence+0x80/0xc8 > [<0>] _nfs4_proc_getattr+0xc8/0xf8 > [<0>] nfs4_proc_getattr+0x88/0x1d8 > [<0>] __nfs_revalidate_inode+0x1f8/0x468 > [<0>] nfs_getattr+0x14c/0x420 > [<0>] vfs_getattr_nosec+0x7c/0x98 > [<0>] vfs_getattr+0x48/0x58 > [<0>] vfs_statx+0xb4/0x118 > [<0>] __se_sys_newfstatat+0x58/0x98 > [<0>] __arm64_sys_newfstatat+0x24/0x30 > [<0>] el0_svc_handler+0x7c/0x118 > [<0>] el0_svc+0x8/0xc > [<0>] 0xffffffffffffffff > > .... Is there anything else blocked in the RPC layer? The above are all standard tasks waiting for the rpciod/xprtiod workqueues to complete the calls to the server. -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@xxxxxxxxxxxxxxx