On 3/8/21 10:30 AM, Pavel Begunkov wrote:
> Don't set IO_SQ_THREAD_SHOULD_STOP when io_sq_offload_create() has
> failed on io_uring_alloc_task_context() but leave everything to
> io_sq_thread_finish(), because currently io_sq_thread_finish()
> hangs on trying to park it. That's great it stalls there, because
> otherwise the following io_sq_thread_stop() would be skipped on
> IO_SQ_THREAD_SHOULD_STOP check and the sqo would race for sqd with
> freeing ctx.
>
> A simple error injection gives something like this.
>
> [ 245.463955] INFO: task sqpoll-test-hang:523 blocked for more than 122 seconds.
> [ 245.463983] Call Trace:
> [ 245.463990] __schedule+0x36b/0x950
> [ 245.464005] schedule+0x68/0xe0
> [ 245.464013] schedule_timeout+0x209/0x2a0
> [ 245.464032] wait_for_completion+0x8b/0xf0
> [ 245.464043] io_sq_thread_finish+0x44/0x1a0
> [ 245.464049] io_uring_setup+0x9ea/0xc80
> [ 245.464058] __x64_sys_io_uring_setup+0x16/0x20
> [ 245.464064] do_syscall_64+0x38/0x50
> [ 245.464073] entry_SYSCALL_64_after_hwframe+0x44/0xae

Applied, thanks.

-- 
Jens Axboe