On 5/16/19 2:53 AM, Roman Penyaev wrote:
> This fixes a couple of races which lead to an infinite wait for park
> completion, with the following backtraces:
>
> [20801.303319] Call Trace:
> [20801.303321]  ? __schedule+0x284/0x650
> [20801.303323]  schedule+0x33/0xc0
> [20801.303324]  schedule_timeout+0x1bc/0x210
> [20801.303326]  ? schedule+0x3d/0xc0
> [20801.303327]  ? schedule_timeout+0x1bc/0x210
> [20801.303329]  ? preempt_count_add+0x79/0xb0
> [20801.303330]  wait_for_completion+0xa5/0x120
> [20801.303331]  ? wake_up_q+0x70/0x70
> [20801.303333]  kthread_park+0x48/0x80
> [20801.303335]  io_finish_async+0x2c/0x70
> [20801.303336]  io_ring_ctx_wait_and_kill+0x95/0x180
> [20801.303338]  io_uring_release+0x1c/0x20
> [20801.303339]  __fput+0xad/0x210
> [20801.303341]  task_work_run+0x8f/0xb0
> [20801.303342]  exit_to_usermode_loop+0xa0/0xb0
> [20801.303343]  do_syscall_64+0xe0/0x100
> [20801.303349]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> [20801.303380] Call Trace:
> [20801.303383]  ? __schedule+0x284/0x650
> [20801.303384]  schedule+0x33/0xc0
> [20801.303386]  io_sq_thread+0x38a/0x410
> [20801.303388]  ? __switch_to_asm+0x40/0x70
> [20801.303390]  ? wait_woken+0x80/0x80
> [20801.303392]  ? _raw_spin_lock_irqsave+0x17/0x40
> [20801.303394]  ? io_submit_sqes+0x120/0x120
> [20801.303395]  kthread+0x112/0x130
> [20801.303396]  ? kthread_create_on_node+0x60/0x60
> [20801.303398]  ret_from_fork+0x35/0x40
>
> o kthread_park() waits for park completion, so the io_sq_thread() loop
>   should check kthread_should_park() along with kthread_should_stop(),
>   otherwise if kthread_park() is called before prepare_to_wait(), the
>   following schedule() never returns:
>
>   CPU#0                      CPU#1
>
>   io_sq_thread_stop():       io_sq_thread():
>
>                                while (!kthread_should_stop() && !ctx->sqo_stop) {
>
>     ctx->sqo_stop = 1;
>     kthread_park()
>
>                                  prepare_to_wait();
>                                  if (kthread_should_stop()) {
>                                  }
>                                  schedule();  <<< nobody checks the park flag,
>                                               <<< so we schedule and never return
>
> o if the ctx->sqo_stop flag is observed by the io_sq_thread() loop,
>   it is quite possible that the kthread_should_park() check and the
>   following kthread_parkme() are never reached, because kthread_park()
>   has not yet been called; but a few moments later it is called and
>   waits there for park completion, which never happens, because the
>   kthread has already exited:
>
>   CPU#0                      CPU#1
>
>   io_sq_thread_stop():       io_sq_thread():
>
>     ctx->sqo_stop = 1;
>                                while (!kthread_should_stop() && !ctx->sqo_stop) {
>                                  <<< observe sqo_stop and exit the loop
>                                }
>
>                                if (kthread_should_park())
>                                  kthread_parkme();  <<< never called, since we
>                                                     <<< were never parked
>
>     kthread_park()  <<< waits forever for park completion
>
> In this patch we quit the loop only on the kthread_should_park() check
> (kthread_park() is synchronous, so kthread_should_stop() is never
> observed), and we abandon the ->sqo_stop flag, since it is racy. At the
> end of io_sq_thread() we unconditionally call parkme(), since we exited
> the loop because of the park flag.

Thanks Roman, looks (and tests) good. Applied.

-- 
Jens Axboe
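
[For readers following the thread: below is a minimal sketch of the
parking pattern the patch description arrives at, reduced to a generic
kthread. The worker_thread()/do_work() names are hypothetical stand-ins
for io_sq_thread() and its submission work; only kthread_should_park(),
kthread_parkme() and the kthread_park() caller are real kernel APIs.]

#include <linux/kthread.h>

static int worker_thread(void *data)
{
	/*
	 * Loop solely on the park flag. kthread_park() is synchronous,
	 * so kthread_should_stop() can never be observed from inside
	 * this loop, and a separate stop flag (like ->sqo_stop) only
	 * reintroduces the race described above.
	 */
	while (!kthread_should_park()) {
		/* do_work(data); then prepare_to_wait()/schedule() */
	}

	/*
	 * We can only have left the loop because the park flag was
	 * set, so parking unconditionally here is safe and completes
	 * the kthread_park() the stopping side is waiting on.
	 */
	kthread_parkme();
	return 0;
}

The stopping side then does kthread_park(task) followed by
kthread_stop(task), and neither call can miss the handshake.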