On 3/12/21 2:41 PM, Pavel Begunkov wrote:
> On 12/03/2021 19:40, Jens Axboe wrote:
>> On 3/12/21 12:35 PM, Pavel Begunkov wrote:
>>> On 11/03/2021 23:29, Pavel Begunkov wrote:
>>>> 1) The first problem is io_uring_cancel_sqpoll() ->
>>>> io_uring_cancel_task_requests() basically doing park(); park(); and so
>>>> hanging.
>>>>
>>>> 2) Another one is more subtle, when the master task is doing cancellations,
>>>> but SQPOLL task submits in-between the end of the cancellation but
>>>> before finish() requests taking a ref to the ctx, and so eternally
>>>> locking it up.
>>>>
>>>> 3) Yet another is a dying SQPOLL task doing io_uring_cancel_sqpoll() and
>>>> same io_uring_cancel_sqpoll() from the owner task, they race for
>>>> tctx->wait events. And there are probably more of them.
>>>>
>>>> Instead do SQPOLL cancellations from within SQPOLL task context via
>>>> task_work, see io_sqpoll_cancel_sync(). With that we don't need temporary
>>>> park()/unpark() during cancellation, which is ugly, subtle and anyway
>>>> doesn't allow doing io_run_task_work() properly.
>>>>
>>>> io_uring_cancel_sqpoll() is called only from SQPOLL task context and
>>>> under sqd locking, so all parking is removed from there. And so,
>>>> io_sq_thread_[un]park() and io_sq_thread_stop() are not used now by
>>>> SQPOLL task, and that spares us some headache.
>>>>
>>>> Also remove ctx->sqd_list early to avoid 2). And kill tctx->sqpoll,
>>>> which is not used anymore.
>>>
>>>
>>> Looks like the chunk below somehow slipped from the patch. Not important
>>> for 5.12, but it can be folded in anyway
>>>
>>> diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h
>>> index 9761a0ec9f95..c24c62b47745 100644
>>> --- a/include/linux/io_uring.h
>>> +++ b/include/linux/io_uring.h
>>> @@ -22,7 +22,6 @@ struct io_uring_task {
>>>  	void			*io_wq;
>>>  	struct percpu_counter	inflight;
>>>  	atomic_t		in_idle;
>>> -	bool			sqpoll;
>>>
>>>  	spinlock_t		task_lock;
>>>  	struct io_wq_work_list	task_list;
>>
>> Let's do it as a separate patch instead.
>
> Ok, I'll send it for-5.13 when it's appropriate.

Yeah that's fine, obviously no rush. I'll rebase for-5.13/io_uring when
-rc3 is out.

-- 
Jens Axboe