On 01/04/2021 02:17, Jens Axboe wrote:
> On 3/31/21 5:18 PM, Pavel Begunkov wrote:
>> [  491.222908] INFO: task thread-exit:2490 blocked for more than 122 seconds.
>> [  491.222957] Call Trace:
>> [  491.222967]  __schedule+0x36b/0x950
>> [  491.222985]  schedule+0x68/0xe0
>> [  491.222994]  schedule_timeout+0x209/0x2a0
>> [  491.223003]  ? tlb_flush_mmu+0x28/0x140
>> [  491.223013]  wait_for_completion+0x8b/0xf0
>> [  491.223023]  io_wq_destroy_manager+0x24/0x60
>> [  491.223037]  io_wq_put_and_exit+0x18/0x30
>> [  491.223045]  io_uring_clean_tctx+0x76/0xa0
>> [  491.223061]  __io_uring_files_cancel+0x1b9/0x2e0
>> [  491.223068]  ? blk_finish_plug+0x26/0x40
>> [  491.223085]  do_exit+0xc0/0xb40
>> [  491.223099]  ? syscall_trace_enter.isra.0+0x1a1/0x1e0
>> [  491.223109]  __x64_sys_exit+0x1b/0x20
>> [  491.223117]  do_syscall_64+0x38/0x50
>> [  491.223131]  entry_SYSCALL_64_after_hwframe+0x44/0xae
>> [  491.223177] INFO: task iou-mgr-2490:2491 blocked for more than 122 seconds.
>> [  491.223194] Call Trace:
>> [  491.223198]  __schedule+0x36b/0x950
>> [  491.223206]  ? pick_next_task_fair+0xcf/0x3e0
>> [  491.223218]  schedule+0x68/0xe0
>> [  491.223225]  schedule_timeout+0x209/0x2a0
>> [  491.223236]  wait_for_completion+0x8b/0xf0
>> [  491.223246]  io_wq_manager+0xf1/0x1d0
>> [  491.223255]  ? recalc_sigpending+0x1c/0x60
>> [  491.223265]  ? io_wq_cpu_online+0x40/0x40
>> [  491.223272]  ret_from_fork+0x22/0x30
>>
>> When an io-wq worker exits and sees IO_WQ_BIT_EXIT, it tries not to
>> cancel all remaining requests but to execute them, hence we may wait
>> on the exiting task for a long time until someone pushes it, e.g. with
>> SIGKILL. Actively cancel pending work items on io-wq destruction.
>>
>> note: io_run_cancel() moved up without any changes.
>
> Just to pull some of the discussion in here - I don't think this is a
> good idea as-is. At the very least, this should be gated on UNBOUND,
> and just waiting for bounded requests while canceling unbounded ones.

Right, and this may be unexpected for userspace as well, e.g.
sockets/pipes. Another approach would be to keep executing for some
time, and if that doesn't help, go and kill them all - or a mixture of
both. That would at least give socket ops a chance to complete if the
traffic is live and they aren't stuck waiting forever. Though, as with
the original problem, this holds up do_exit() for a while, which isn't
nice, so it might need to defer this final io-wq execution to async
context and let do_exit() proceed.

--
Pavel Begunkov