On 10/29/21 6:11 AM, Pavel Begunkov wrote:
> INFO: task iou-wrk-6609:6612 blocked for more than 143 seconds.
>       Not tainted 5.15.0-rc5-syzkaller #0
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:iou-wrk-6609 state:D stack:27944 pid: 6612 ppid: 6526 flags:0x00004006
> Call Trace:
>  context_switch kernel/sched/core.c:4940 [inline]
>  __schedule+0xb44/0x5960 kernel/sched/core.c:6287
>  schedule+0xd3/0x270 kernel/sched/core.c:6366
>  schedule_timeout+0x1db/0x2a0 kernel/time/timer.c:1857
>  do_wait_for_common kernel/sched/completion.c:85 [inline]
>  __wait_for_common kernel/sched/completion.c:106 [inline]
>  wait_for_common kernel/sched/completion.c:117 [inline]
>  wait_for_completion+0x176/0x280 kernel/sched/completion.c:138
>  io_worker_exit fs/io-wq.c:183 [inline]
>  io_wqe_worker+0x66d/0xc40 fs/io-wq.c:597
>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
>
> An io-wq worker may submit a task_work to the master task and, upon
> io_worker_exit(), wait for the tw to get executed. The problem appears
> when the master task is waiting in coredump.c:
>
> 468                     freezer_do_not_count();
> 469                     wait_for_completion(&core_state->startup);
> 470                     freezer_count();
>
> Apparently there is some dependency on child threads that gets
> everything stuck. Work around it by cancelling the task_work callback
> that causes it before going into the io_worker_exit() wait.
>
> p.s. probably a better option is to not submit tw elevating the refcount
> in the first place, but let's leave this exercise for the future.

I've applied this for 5.16. It does look good to me, but I'm not
comfortable adding this to 5.15 so late in the process.

-- 
Jens Axboe