Hi EXT4 Experts, We came cross an ext4 journaling problem, and I did some googling on high I/O wait, found a lot of links, but none of them is the same. The backtrace when system is hung is : Aug 7 11:31:33 2f4526ab kernel: "echo 0> /proc/sys/kernel/hung_task_timeout_secs" disables this message. ... Aug 7 11:31:33 2f4526ab kernel: Call Trace: Aug 7 11:31:33 2f4526ab kernel: [<ffffffff816096a9>] schedule+0x29/0x70 Aug 7 11:31:33 2f4526ab kernel: [<ffffffffa031d785>] jbd2_log_wait_commit+0xc5/0x140 [jbd2] Aug 7 11:31:33 2f4526ab kernel: [<ffffffff81098230>] ? wake_up_bit+0x30/0x30 Aug 7 11:31:33 2f4526ab kernel: [<ffffffffa031fe32>] jbd2_complete_transaction+0x52/0xa0 [jbd2] Aug 7 11:31:33 2f4526ab kernel: [<ffffffffa03339f2>] ext4_sync_file+0x292/0x320 [ext4] Aug 7 11:31:33 2f4526ab kernel: [<ffffffff811f6ba5>] do_fsync+0x65/0xa0 Aug 7 11:31:33 2f4526ab kernel: [<ffffffff811f6e70>] SyS_fsync+0x10/0x20 Aug 7 11:31:33 2f4526ab kernel: [<ffffffff81614209>] system_call_fastpath+0x16/0x1b At application layer, we use a lot of fsync APIs. When I/O load is heavy, we saw a lot of I/Os can't finish within 120 seconds because JBD2 can't finish the journaling. We also see when this happens, the JBD2 info shows very high on the following two items: $ cat /proc/fs/jbd2/nvme0n1/info 85162438 transactions (85162436 requested), each up to 8192 blocks average: 0ms waiting for transaction 0ms request delay 1ms running transaction <==== this will be very high, 3 to 6 seconds 0ms transaction was being locked 0ms flushing data (in ordered mode) 0ms logging transaction 153us average transaction commit time <=== this will be very high, half to one second 35 handles per transaction 2 blocks per transaction 3 logged blocks per transaction Our environment : $ mount /dev/nvme0n1 on /logs type ext4 (rw,noatime,data=ordered) $ uname -a Linux a69ac961 3.10.0-229.7.2.el7.x86_64 #1 SMP Tue Jun 23 22:06:11 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux $ cat /etc/redhat-release CentOS Linux release 7.1.1503 (Core) $ sudo tune2fs -l /dev/nvme0n1 tune2fs 1.42.9 (28-Dec-2013) Filesystem volume name: cty-data Last mounted on: <not available> Filesystem UUID: 227fde0d-a3ae-4236-9d32-46ef0091b3b3 Filesystem magic number: 0xEF53 Filesystem revision #: 1 (dynamic) Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery extent 64bit flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize Filesystem flags: signed_directory_hash Default mount options: user_xattr acl Filesystem state: clean Errors behavior: Continue Filesystem OS type: Linux Inode count: 91062272 Block count: 364226560 Reserved block count: 3642265 Free blocks: 126192987 Free inodes: 90916973 First block: 0 Block size: 4096 Fragment size: 4096 Group descriptor size: 64 Reserved GDT blocks: 1024 Blocks per group: 32768 Fragments per group: 32768 Inodes per group: 8192 Inode blocks per group: 512 Flex block group size: 16 Filesystem created: Wed Aug 5 13:00:41 2015 Last mount time: Mon Aug 10 18:32:59 2015 Last write time: Mon Aug 10 18:32:59 2015 Mount count: 6 Maximum mount count: -1 Last checked: Wed Aug 5 13:00:41 2015 Check interval: 0 (<none>) Lifetime writes: 135 MB Reserved blocks uid: 0 (user root) Reserved blocks gid: 0 (group root) First inode: 11 Inode size: 256 Required extra isize: 28 Desired extra isize: 28 Journal inode: 8 Default directory hash: half_md4 Directory Hash Seed: c8f8aa7c-583b-4251 Any suggestions or hints how to debug this problem will be appreciated! Thank you very much, Roy © 2015 Microsoft Terms Privacy & cookies Developers English (United States) -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html