On Mon, Jan 27, 2020 at 12:37:55PM +0900, Satoru Takeuchi wrote:
> In the Rook(*1)/Ceph community, some users encountered hang_task in XFS.
> Although we've not reproduced this problem in the newest kernel, could anyone
> give us any hint about this problem, if possible?
>
> *1) A Ceph orchestration in Kubernetes
>
> Here is the detail.
>
> Under some workload in Ceph, many processes ran into hang_task. We found
> that there are two kinds of processes:
>
> a) In very high CPU load
> b) Encountered hang_task in XFS
>
> In addition, a user got the following two kernel traces.
>
> A (b) process's backtrace with `hung_task_panic=1`:
>
> ```
> [51717.039319] INFO: task kworker/2:1:5938 blocked for more than 120 seconds.
> [51717.039361] Not tainted 4.15.0-72-generic #81-Ubuntu

Kinda old, and not an upstream LTS kernel, right?

> [51717.039388] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [51717.039426] kworker/2:1 D 0 5938 2 0x80000000
> [51717.039471] Workqueue: xfs-sync/rbd0 xfs_log_worker [xfs]

Filesystem is on a Ceph RBD device.

> [51717.039472] Call Trace:
> [51717.039478] __schedule+0x24e/0x880
> [51717.039504] ? xlog_sync+0x2d5/0x3c0 [xfs]
> [51717.039506] schedule+0x2c/0x80
> [51717.039530] _xfs_log_force_lsn+0x20e/0x350 [xfs]
> [51717.039533] ? wake_up_q+0x80/0x80
> [51717.039556] __xfs_trans_commit+0x20b/0x280 [xfs]
> [51717.039577] xfs_trans_commit+0x10/0x20 [xfs]
> [51717.039600] xfs_sync_sb+0x6d/0x80 [xfs]
> [51717.039623] xfs_log_worker+0xe7/0x100 [xfs]
> [51717.039626] process_one_work+0x1de/0x420
> [51717.039627] worker_thread+0x32/0x410
> [51717.039628] kthread+0x121/0x140
> [51717.039630] ? process_one_work+0x420/0x420
> [51717.039631] ? kthread_create_worker_on_cpu+0x70/0x70
> [51717.039633] ret_from_fork+0x35/0x40

That's waiting for log IO completion.

> ```
>
> A (b) process's backtrace obtained by `sudo cat /proc/<PID of a D process>/stack`:
>
> ```
> [<0>] _xfs_log_force_lsn+0x20e/0x350 [xfs]
> [<0>] __xfs_trans_commit+0x20b/0x280 [xfs]
> [<0>] xfs_trans_commit+0x10/0x20 [xfs]
> [<0>] xfs_sync_sb+0x6d/0x80 [xfs]
> [<0>] xfs_log_sbcount+0x4b/0x60 [xfs]
> [<0>] xfs_unmountfs+0xe7/0x200 [xfs]
> [<0>] xfs_fs_put_super+0x3e/0xb0 [xfs]
> [<0>] generic_shutdown_super+0x72/0x120
> [<0>] kill_block_super+0x2c/0x80
> [<0>] deactivate_locked_super+0x48/0x80
> [<0>] deactivate_super+0x40/0x60
> [<0>] cleanup_mnt+0x3f/0x80
> [<0>] __cleanup_mnt+0x12/0x20
> [<0>] task_work_run+0x9d/0xc0
> [<0>] exit_to_usermode_loop+0xc0/0xd0
> [<0>] do_syscall_64+0x121/0x130
> [<0>] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
> [<0>] 0xffffffffffffffff
> ```

And this is the last reference to the filesystem being dropped, and it
is waiting for log IO completion.

So the filesystem has been unmounted, and it's waiting for journal IO
on the device to complete. I wonder if a wakeup was missed somewhere?
Did the system stop/tear down /dev/rbd0 prematurely?

> Related discussions:
> - Issue of Rook:
>   https://github.com/rook/rook/issues/3132
> - Issue of Ceph:
>   https://tracker.ceph.com/issues/40068

These point to Ceph RBDs failing to respond under high load and to tasks
hanging because they are waiting on IO. Those are exactly the symptoms
you are reporting here.

That points to it being a Ceph RBD issue to me, especially the reports
where rbd devices report no IO load but the Ceph back end is at 100%
disk utilisation doing -something-.

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx
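
Since both traces end up waiting on journal IO to /dev/rbd0, it is worth
confirming whether the wait is stuck inside XFS or below it in the RBD layer.
Below is a minimal diagnostic sketch, not taken from the thread above; it
assumes the kernel rbd client is in use, debugfs is mounted at
/sys/kernel/debug, and the rbd CLI is installed. Adjust the device name to
match the hung mount.

```
# Block-layer view: field 9 of the stat file is the number of requests
# currently in flight against the rbd device.
cat /sys/block/rbd0/stat

# libceph view: OSD requests the kernel client is still waiting on.
# Entries that never drain point at the Ceph back end, not at XFS.
sudo cat /sys/kernel/debug/ceph/*/osdc

# Check that the device is still mapped and was not torn down
# underneath the filesystem.
rbd showmapped

# Stack of a stuck task, for comparison with the traces above.
sudo cat /proc/<PID of a D process>/stack
```

If the osdc entries sit unchanged while the D-state stacks stay parked in
_xfs_log_force_lsn, the filesystem is simply waiting for the RBD device to
complete IO, which matches the RBD-side explanation above.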