On Fri, Dec 6, 2024 at 4:08 PM Max Kellermann <max.kellermann@xxxxxxxxx> wrote: > Similar hangs wth 6.12.2 (vanilla without your "netfs-writeback" branch): (Correction: this was 6.12.3, not 6.12.2) I tried with 6.12.3 + dhowells/netfs-writeback; David's branch solved many problems and it took much longer to trigger the hang, but it eventually occurred: INFO: task bash:6599 blocked for more than 122 seconds. Not tainted 6.12.3-cm4all0-hp+ #298 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:bash state:D stack:0 pid:6599 tgid:6599 ppid:6598 flags:0x00000006 Call Trace: <TASK> __schedule+0xc34/0x4df0 ? is_dynamic_key+0x120/0x150 ? __pfx___schedule+0x10/0x10 ? lock_release+0x206/0x660 ? schedule+0x283/0x340 ? __pfx_lock_release+0x10/0x10 ? schedule+0x1e8/0x340 schedule+0xdc/0x340 schedule_preempt_disabled+0xa/0x10 rwsem_down_read_slowpath+0x6ba/0xd00 ? __pfx_rwsem_down_read_slowpath+0x10/0x10 ? kernel_text_address+0xb8/0x150 ? lock_acquire+0x11f/0x290 ? ceph_start_io_read+0x19/0x80 down_read+0xcd/0x220 ? __pfx_down_read+0x10/0x10 ? do_sys_openat2+0x106/0x160 ? stack_trace_save+0x96/0xd0 ? __pfx_stack_trace_save+0x10/0x10 ceph_start_io_read+0x19/0x80 ceph_read_iter+0x2e2/0xe70 ? __pfx_ceph_read_iter+0x10/0x10 ? psi_task_switch+0x256/0x810 ? find_held_lock+0x2d/0x110 ? lock_release+0x206/0x660 ? finish_task_switch.isra.0+0x1db/0xa40 vfs_read+0x6e1/0xc40 ? lock_acquire+0x11f/0x290 ? finish_task_switch.isra.0+0x129/0xa40 ? __pfx_vfs_read+0x10/0x10 ? finish_task_switch.isra.0+0x225/0xa40 ? fdget_pos+0x1b3/0x540 ? __pfx___seccomp_filter+0x10/0x10 ksys_read+0xee/0x1c0 ? __pfx_ksys_read+0x10/0x10 ? lock_release+0x206/0x660 ? __ceph_do_getattr+0xe8/0x380 do_syscall_64+0x64/0x100 ? fdget_raw+0x53/0x390 ? __do_sys_newfstatat+0x86/0xd0 ? __pfx___do_sys_newfstatat+0x10/0x10 ? syscall_exit_to_user_mode+0x57/0x120 ? do_syscall_64+0x70/0x100 ? irqentry_exit_to_user_mode+0x3d/0x100 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7f71fb04f21d RSP: 002b:00007ffdd516a918 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f71fb04f21d RDX: 0000000000003003 RSI: 00005597dc55d4d0 RDI: 0000000000000003 RBP: 0000000000003003 R08: 00007f71fb12a020 R09: 00007f71fb12a020 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000000001f4 R13: 00005597dc485340 R14: 00005597dc55d4d0 R15: 00005597be8d7524 </TASK> INFO: task bash:6614 blocked for more than 122 seconds. Not tainted 6.12.3-cm4all0-hp+ #298 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:bash state:D stack:0 pid:6614 tgid:6614 ppid:6613 flags:0x00000002 Call Trace: <TASK> __schedule+0xc34/0x4df0 ? __free_insn_slot+0x370/0x3d0 ? __pfx___schedule+0x10/0x10 ? lock_release+0x206/0x660 ? schedule+0x283/0x340 ? __pfx_lock_release+0x10/0x10 ? schedule+0x1e8/0x340 schedule+0xdc/0x340 schedule_preempt_disabled+0xa/0x10 rwsem_down_read_slowpath+0x6ba/0xd00 ? __pfx_rwsem_down_read_slowpath+0x10/0x10 ? kasan_save_stack+0x1c/0x40 ? kasan_save_track+0x10/0x30 ? lock_acquire+0x11f/0x290 ? ceph_start_io_read+0x19/0x80 ? find_held_lock+0x2d/0x110 down_read+0xcd/0x220 ? __ceph_caps_issued_mask+0x416/0xa10 ? __pfx_down_read+0x10/0x10 ceph_start_io_read+0x19/0x80 ceph_read_iter+0x2e2/0xe70 ? _copy_to_user+0x50/0x70 ? __pfx_ceph_read_iter+0x10/0x10 ? fdget_raw+0x53/0x390 vfs_read+0x6e1/0xc40 ? __do_sys_newfstatat+0x86/0xd0 ? __pfx___do_sys_newfstatat+0x10/0x10 ? __pfx_vfs_read+0x10/0x10 ? fdget_pos+0x1b3/0x540 ? __pfx___seccomp_filter+0x10/0x10 ksys_read+0xee/0x1c0 ? __pfx_ksys_read+0x10/0x10 do_syscall_64+0x64/0x100 ? do_user_addr_fault+0x401/0x8f0 ? find_held_lock+0x59/0x110 ? find_held_lock+0x2d/0x110 ? lock_release+0x206/0x660 ? do_user_addr_fault+0x45e/0x8f0 ? __pfx_lock_release+0x10/0x10 ? do_user_addr_fault+0x401/0x8f0 ? do_user_addr_fault+0x463/0x8f0 ? irqentry_exit_to_user_mode+0x3d/0x100 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7fa84a43a21d RSP: 002b:00007ffec8720278 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fa84a43a21d RDX: 0000000000003003 RSI: 0000565007f224d0 RDI: 0000000000000003 RBP: 0000000000003003 R08: 00007fa84a515020 R09: 00007fa84a515020 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000000001f4 R13: 0000565007e4a340 R14: 0000565007f224d0 R15: 0000565005ba8524 </TASK> Showing all locks held in the system: 1 lock held by khungtaskd/163: #0: ffffffffae629b80 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x64/0x280 2 locks held by bash/3365: #0: ffff8881661803e0 (sb_writers#19){....}-{0:0}, at: ksys_write+0xee/0x1c0 #1: ffff888192604b18 (&sb->s_type->i_mutex_key#19){....}-{3:3}, at: ceph_start_io_write+0x15/0x30 1 lock held by bash/6599: #0: ffff888192604b18 (&sb->s_type->i_mutex_key#19){....}-{3:3}, at: ceph_start_io_read+0x19/0x80 1 lock held by bash/6614: #0: ffff888192604b18 (&sb->s_type->i_mutex_key#19){....}-{3:3}, at: ceph_start_io_read+0x19/0x80 =============================================