On 5/23/21 8:48 AM, Pavel Begunkov wrote:
> There is an old problem with io-wq cancellation where requests should be
> killed and are in io-wq but are not discoverable, e.g. in @next_hashed
> or @linked vars of io_worker_handle_work(). It adds some unreliability
> to individual request cancellation, but also may potentially get
> __io_uring_cancel() stuck. For instance:
>
> 1) An __io_uring_cancel()'s cancellation round has not found any
>    request but there are some as described.
> 2) __io_uring_cancel() goes to sleep
> 3) Then workers wake up and try to execute those hidden requests
>    that happen to be unbound.
>
> As we already cancel all requests of io-wq there, set IO_WQ_BIT_EXIT
> in advance, thereby preventing 3) from executing unbound requests. The
> workers will initially break looping because of getting a signal as they
> are threads of the dying/exec()'ing user task.

Applied, thanks.

-- 
Jens Axboe
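
For readers following the thread, the ordering described in the quoted patch can be modelled in userspace roughly as below. This is only a sketch under stated assumptions, not the kernel code: every `model_*` name and `MODEL_WQ_BIT_EXIT` is hypothetical, C11 atomics stand in for the kernel's bit operations, and the only behaviour shown is that an unbound request picked up after the exit bit is set gets cancelled instead of executed.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel API. */
enum { MODEL_WQ_BIT_EXIT = 0 };

struct model_work {
    bool unbound;    /* request not tied to a bounded worker */
    bool executed;   /* set when the worker actually ran it */
    bool cancelled;  /* set when the worker refused to run it */
};

struct model_wq {
    atomic_ulong state;  /* bit field, stands in for the wq state word */
};

/* Returns true once the exit bit has been set on the queue. */
static bool model_wq_exiting(struct model_wq *wq)
{
    return (atomic_load(&wq->state) & (1UL << MODEL_WQ_BIT_EXIT)) != 0;
}

/* Cancellation path: set the exit bit in advance, before going to sleep. */
static void model_wq_mark_exit(struct model_wq *wq)
{
    atomic_fetch_or(&wq->state, 1UL << MODEL_WQ_BIT_EXIT);
}

/*
 * Worker path: a request that was hidden from the cancellation round is
 * picked up here. With the exit bit already set, unbound work is
 * cancelled rather than executed, which is step 3) being prevented.
 */
static void model_worker_run(struct model_wq *wq, struct model_work *work)
{
    if (work->unbound && model_wq_exiting(wq)) {
        work->cancelled = true;
        return;
    }
    work->executed = true;
}
```

In this model, calling model_wq_mark_exit() before the canceller sleeps is what closes the window: any later model_worker_run() on an unbound request sees the bit and bails out.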