[ 491.222908] INFO: task thread-exit:2490 blocked for more than 122 seconds.
[ 491.222957] Call Trace:
[ 491.222967]  __schedule+0x36b/0x950
[ 491.222985]  schedule+0x68/0xe0
[ 491.222994]  schedule_timeout+0x209/0x2a0
[ 491.223003]  ? tlb_flush_mmu+0x28/0x140
[ 491.223013]  wait_for_completion+0x8b/0xf0
[ 491.223023]  io_wq_destroy_manager+0x24/0x60
[ 491.223037]  io_wq_put_and_exit+0x18/0x30
[ 491.223045]  io_uring_clean_tctx+0x76/0xa0
[ 491.223061]  __io_uring_files_cancel+0x1b9/0x2e0
[ 491.223068]  ? blk_finish_plug+0x26/0x40
[ 491.223085]  do_exit+0xc0/0xb40
[ 491.223099]  ? syscall_trace_enter.isra.0+0x1a1/0x1e0
[ 491.223109]  __x64_sys_exit+0x1b/0x20
[ 491.223117]  do_syscall_64+0x38/0x50
[ 491.223131]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 491.223177] INFO: task iou-mgr-2490:2491 blocked for more than 122 seconds.
[ 491.223194] Call Trace:
[ 491.223198]  __schedule+0x36b/0x950
[ 491.223206]  ? pick_next_task_fair+0xcf/0x3e0
[ 491.223218]  schedule+0x68/0xe0
[ 491.223225]  schedule_timeout+0x209/0x2a0
[ 491.223236]  wait_for_completion+0x8b/0xf0
[ 491.223246]  io_wq_manager+0xf1/0x1d0
[ 491.223255]  ? recalc_sigpending+0x1c/0x60
[ 491.223265]  ? io_wq_cpu_online+0x40/0x40
[ 491.223272]  ret_from_fork+0x22/0x30

Cancel all unbound work on exit; otherwise do_exit() ->
io_uring_files_cancel() may wait a long time for io-wq destruction,
e.g. until someone sends a SIGKILL.

Suggested-by: Jens Axboe <axboe@xxxxxxxxx>
Signed-off-by: Pavel Begunkov <asml.silence@xxxxxxxxx>
---

Not quite happy about it, as it cancels pipes and sockets, but it is
probably better than waiting.

 fs/io-wq.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/fs/io-wq.c b/fs/io-wq.c
index 433c4d3c3c1c..e2ab569e47b9 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -702,6 +702,11 @@ static void io_wq_cancel_pending(struct io_wq *wq)
 		io_wqe_cancel_pending_work(wq->wqes[node], &match);
 }
 
+static bool io_wq_cancel_unbounded(struct io_wq_work *work, void *data)
+{
+	return work->flags & IO_WQ_WORK_UNBOUND;
+}
+
 /*
  * Manager thread. Tasked with creating new workers, if we need them.
  */
@@ -736,6 +741,8 @@ static int io_wq_manager(void *data)
 
 	if (atomic_dec_and_test(&wq->worker_refs))
 		complete(&wq->worker_done);
+
+	io_wq_cancel_cb(wq, io_wq_cancel_unbounded, NULL, true);
 	wait_for_completion(&wq->worker_done);
 
 	spin_lock_irq(&wq->hash->wait.lock);
-- 
2.24.0