On Tue, Oct 16, 2018 at 11:42:48AM +0200, Jan Kara wrote: > We can hold j_state_lock for writing at the beginning of > jbd2_journal_commit_transaction() for a rather long time (reportedly for > 30 ms) due cleaning revoke bits of all revoked buffers under it. The > handling of revoke tables as well as cleaning of t_reserved_list, and > checkpoint lists does not need j_state_lock for anything. Furthermore > the transaction is in T_LOCKED state and we waited for all outstanding > handles so nobody is going to be adding anything to the transaction. > > Just drop the lock for unnecessary operations. > > Reported-and-tested-by: Adrian Hunter <adrian.hunter@xxxxxxxxx> > Suggested-by: "Theodore Y. Ts'o" <tytso@xxxxxxx> > Signed-off-by: Jan Kara <jack@xxxxxxx> Unfortunately, this patch is causing a regression in generic/241 in the dioread_nolock configuration. To reproduce: % gce-xftests -c dioread_nolock generic/241 which will result in the decoded stack trace: [ 42.926455] run fstests generic/241 at 2018-11-06 00:12:30 [ 51.091678] JBD2: assertion failure: h_type=11 h_line_no=4921 block_no=524337 jlist=3 [ 51.101179] ------------[ cut here ]------------ [ 51.106191] kernel BUG at fs/jbd2/transaction.c:1372! [ 51.111716] invalid opcode: 0000 [#1] SMP PTI [ 51.116198] CPU: 0 PID: 177 Comm: kworker/u4:2 Not tainted 4.19.0-rc6-xfstests-00022-g895dc4ba4f98 #730 [ 51.125809] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 [ 51.135375] Workqueue: ext4-rsv-conversion ext4_end_io_rsv_work [ 51.141433] RIP: 0010:jbd2_journal_dirty_metadata.cold.7 (??:?) [ 51.147867] Code: c1 ea 0c 66 c1 ee 04 0f b7 d2 40 0f b6 f6 e8 c4 64 de ff 4d 39 6c 24 28 0f 85 22 f4 ff ff 41 83 7c 24 0c 01 0f 84 16 f4 ff ff <0f> 0b 4d 85 c9 74 2c 41 8b 41 08 44 8b 41 08 49 8d b7 c8 05 00 00 All code ======== 0: c1 ea 0c shr $0xc,%edx 3: 66 c1 ee 04 shr $0x4,%si 7: 0f b7 d2 movzwl %dx,%edx a: 40 0f b6 f6 movzbl %sil,%esi e: e8 c4 64 de ff callq 0xffffffffffde64d7 13: 4d 39 6c 24 28 cmp %r13,0x28(%r12) 18: 0f 85 22 f4 ff ff jne 0xfffffffffffff440 1e: 41 83 7c 24 0c 01 cmpl $0x1,0xc(%r12) 24: 0f 84 16 f4 ff ff je 0xfffffffffffff440 2a:* 0f 0b ud2 <-- trapping instruction 2c: 4d 85 c9 test %r9,%r9 2f: 74 2c je 0x5d 31: 41 8b 41 08 mov 0x8(%r9),%eax 35: 44 8b 41 08 mov 0x8(%rcx),%r8d 39: 49 8d b7 c8 05 00 00 lea 0x5c8(%r15),%rsi Code starting with the faulting instruction =========================================== 0: 0f 0b ud2 2: 4d 85 c9 test %r9,%r9 5: 74 2c je 0x33 7: 41 8b 41 08 mov 0x8(%r9),%eax b: 44 8b 41 08 mov 0x8(%rcx),%r8d f: 49 8d b7 c8 05 00 00 lea 0x5c8(%r15),%rsi [ 51.166848] RSP: 0018:ffffb08481573c60 EFLAGS: 00010202 [ 51.172195] RAX: 0000000000000049 RBX: ffff8e72c98e6c98 RCX: 0000000000000000 [ 51.179737] RDX: 0000000000000000 RSI: ffff8e72dfbd54c8 RDI: ffff8e72dfbd54c8 [ 51.187020] RBP: ffff8e72d3247600 R08: 0000001969fa2eee R09: 0000000000000000 [ 51.194639] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8e72c21b22a0 [ 51.201985] R13: ffff8e72c224ce00 R14: 0000000000000000 R15: ffff8e72d2dc9000 [ 51.209795] FS: 0000000000000000(0000) GS:ffff8e72dfa00000(0000) knlGS:0000000000000000 [ 51.218109] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 51.224006] CR2: 00007f67fc9bc420 CR3: 0000000174612001 CR4: 00000000001606f0 [ 51.232026] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 51.240026] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 51.247810] Call Trace: [ 51.250394] __ext4_handle_dirty_metadata (include/linux/jbd2.h:1501 fs/ext4/ext4_jbd2.c:271) [ 51.255691] ext4_do_update_inode.isra.11 (fs/ext4/inode.c:5324) [ 51.261187] ? __ext4_journal_get_write_access (fs/ext4/ext4_jbd2.c:170) [ 51.266943] ? ext4_convert_unwritten_extents (fs/ext4/extents.c:4951) [ 51.272984] ext4_mark_iloc_dirty (fs/ext4/inode.c:5849) [ 51.277389] ext4_mark_inode_dirty (fs/ext4/inode.c:6028) [ 51.282260] ext4_convert_unwritten_extents (fs/ext4/extents.c:4951) [ 51.287900] ext4_end_io_rsv_work (fs/ext4/page-io.c:161 fs/ext4/page-io.c:234 fs/ext4/page-io.c:248) [ 51.292523] process_one_work (arch/x86/include/asm/jump_label.h:36 include/linux/jump_label.h:142 include/trace/events/workqueue.h:114 kernel/workqueue.c:2158) [ 51.296937] worker_thread (include/linux/compiler.h:188 include/linux/list.h:203 kernel/workqueue.c:2297) [ 51.301379] ? process_one_work (kernel/workqueue.c:2239) [ 51.305700] kthread (kernel/kthread.c:246) [ 51.309048] ? kthread_park (kernel/kthread.c:206) [ 51.312836] ret_from_fork (arch/x86/entry/entry_64.S:419) [ 51.316531] Modules linked in: sg [ 51.320170] ---[ end trace 14f78710c7a5d90b ]--- [ 51.325121] RIP: 0010:jbd2_journal_dirty_metadata.cold.7 (??:?) [ 51.331516] Code: c1 ea 0c 66 c1 ee 04 0f b7 d2 40 0f b6 f6 e8 c4 64 de ff 4d 39 6c 24 28 0f 85 22 f4 ff ff 41 83 7c 24 0c 01 0f 84 16 f4 ff ff <0f> 0b 4d 85 c9 74 2c 41 8b 41 08 44 8b 41 08 49 8d b7 c8 05 00 00 All code ======== 0: c1 ea 0c shr $0xc,%edx 3: 66 c1 ee 04 shr $0x4,%si 7: 0f b7 d2 movzwl %dx,%edx a: 40 0f b6 f6 movzbl %sil,%esi e: e8 c4 64 de ff callq 0xffffffffffde64d7 13: 4d 39 6c 24 28 cmp %r13,0x28(%r12) 18: 0f 85 22 f4 ff ff jne 0xfffffffffffff440 1e: 41 83 7c 24 0c 01 cmpl $0x1,0xc(%r12) 24: 0f 84 16 f4 ff ff je 0xfffffffffffff440 2a:* 0f 0b ud2 <-- trapping instruction 2c: 4d 85 c9 test %r9,%r9 2f: 74 2c je 0x5d 31: 41 8b 41 08 mov 0x8(%r9),%eax 35: 44 8b 41 08 mov 0x8(%rcx),%r8d 39: 49 8d b7 c8 05 00 00 lea 0x5c8(%r15),%rsi Code starting with the faulting instruction =========================================== 0: 0f 0b ud2 2: 4d 85 c9 test %r9,%r9 5: 74 2c je 0x33 7: 41 8b 41 08 mov 0x8(%r9),%eax b: 44 8b 41 08 mov 0x8(%rcx),%r8d f: 49 8d b7 c8 05 00 00 lea 0x5c8(%r15),%rsi [ 51.350792] RSP: 0018:ffffb08481573c60 EFLAGS: 00010202 [ 51.356407] RAX: 0000000000000049 RBX: ffff8e72c98e6c98 RCX: 0000000000000000 [ 51.363778] RDX: 0000000000000000 RSI: ffff8e72dfbd54c8 RDI: ffff8e72dfbd54c8 [ 51.371485] RBP: ffff8e72d3247600 R08: 0000001969fa2eee R09: 0000000000000000 [ 51.378856] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8e72c21b22a0 [ 51.386323] R13: ffff8e72c224ce00 R14: 0000000000000000 R15: ffff8e72d2dc9000 [ 51.393775] FS: 0000000000000000(0000) GS:ffff8e72dfa00000(0000) knlGS:0000000000000000 [ 51.402416] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 51.408605] CR2: 00007f67fc9bc420 CR3: 0000000174612001 CR4: 00000000001606f0 [ 51.416057] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 51.423596] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 51.431668] BUG: sleeping function called from invalid context at include/linux/percpu-rwsem.h:34 [ 51.441028] in_atomic(): 1, irqs_disabled(): 0, pid: 177, name: kworker/u4:2 [ 51.448399] INFO: lockdep is turned off. [ 51.452644] CPU: 0 PID: 177 Comm: kworker/u4:2 Tainted: G D 4.19.0-rc6-xfstests-00022-g895dc4ba4f98 #730 [ 51.464004] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 [ 51.473697] Workqueue: ext4-rsv-conversion ext4_end_io_rsv_work [ 51.479745] Call Trace: [ 51.482499] dump_stack (lib/dump_stack.c:115) [ 51.486326] ___might_sleep.cold.12 (kernel/sched/core.c:6145) [ 51.491043] exit_signals (include/linux/percpu-rwsem.h:36 include/linux/percpu-rwsem.h:59 include/linux/cgroup-defs.h:691 kernel/signal.c:2598) [ 51.494749] do_exit (kernel/exit.c:818) [ 51.498387] ? process_one_work (kernel/workqueue.c:2239) [ 51.502698] ? kthread (kernel/kthread.c:246) [ 51.506218] rewind_stack_do_exit (??:?) [ 51.510639] note: kworker/u4:2[177] exited with preempt_count 1