On 1/18/22 4:36 PM, Jens Axboe wrote:
> On 1/18/22 1:05 PM, Florian Fischer wrote:
>>>> After reading the io_uring_enter(2) man page a IORING_OP_ASYNC_CANCEL's return value of -EALREADY apparently
>>>> may not cause the request to terminate. At least that is our interpretation of "…res field will contain -EALREADY.
>>>> In this case, the request may or may not terminate."
>>>
>>> I took a look at this, and my theory is that the request cancelation
>>> ends up happening right in between when the work item is moved between
>>> the work list and to the worker itself. The way the async queue works,
>>> the work item is sitting in a list until it gets assigned by a worker.
>>> When that assignment happens, it's removed from the general work list
>>> and then assigned to the worker itself. There's a small gap there where
>>> the work cannot be found in the general list, and isn't yet findable in
>>> the worker itself either.
>>>
>>> Do you always see -ENOENT from the cancel when you get the hang
>>> condition?
>>
>> No we also and actually more commonly observe cancel returning
>> -EALREADY and the canceled read request never gets completed.
>>
>> As shown in the log snippet I included below.
>
> I think there are a couple of different cases here. Can you try the
> below patch? It's against current -git.

Cleaned it up and split it into functional bits, end result is here:

https://git.kernel.dk/cgit/linux-block/log/?h=io_uring-5.17

-- 
Jens Axboe
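
For anyone following along, the hand-off gap described in the quoted theory can be sketched with a toy, single-threaded model. This is only an illustration under stated assumptions: struct work, struct queue, dequeue(), assign() and try_cancel() are hypothetical names made up for this sketch, not the kernel's io-wq code, and the mapping of lookup results to 0, -EALREADY and -ENOENT simply mirrors the io_uring_enter(2) wording for IORING_OP_ASYNC_CANCEL.

```c
#include <errno.h>
#include <stddef.h>

/* Toy model only: hypothetical types, not the kernel's io-wq structures. */
struct work {
    int id;
    struct work *next;
};

struct queue {
    struct work *pending;  /* the general work list */
    struct work *running;  /* item currently owned by the single worker */
};

/* Hand-off step 1: remove the first item from the general work list. */
static struct work *dequeue(struct queue *q)
{
    struct work *w = q->pending;

    if (w) {
        q->pending = w->next;
        w->next = NULL;
    }
    return w;
}

/* Hand-off step 2: make the item visible as the worker's current work. */
static void assign(struct queue *q, struct work *w)
{
    q->running = w;
}

/*
 * Cancel lookup: search the pending list first, then the worker.
 * Returns 0 if the item was still pending and got unlinked,
 * -EALREADY if it is already owned by the worker,
 * -ENOENT if it is visible in neither place.
 */
static int try_cancel(struct queue *q, int id)
{
    struct work **pp;

    for (pp = &q->pending; *pp; pp = &(*pp)->next) {
        if ((*pp)->id == id) {
            struct work *w = *pp;

            *pp = w->next;  /* unlink from the pending list */
            w->next = NULL;
            return 0;
        }
    }
    if (q->running && q->running->id == id)
        return -EALREADY;
    return -ENOENT;
}
```

In this model, a try_cancel() issued after dequeue() but before assign() reports -ENOENT even though the item is still alive and will run, which is the window the theory points at. The real code is concurrent and uses locking, so treat this purely as a picture of the ordering problem.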