On 20/12/2020 00:25, Jens Axboe wrote: > On 12/19/20 4:42 PM, Pavel Begunkov wrote: >> On 19/12/2020 23:13, Jens Axboe wrote: >>> On 12/19/20 2:54 PM, Jens Axboe wrote: >>>> On 12/19/20 1:51 PM, Josef wrote: >>>>>> And even more so, it's IOSQE_ASYNC on the IORING_OP_READ on an eventfd >>>>>> file descriptor. You probably don't want/mean to do that as it's >>>>>> pollable, I guess it's done because you just set it on all reads for the >>>>>> test? >>>>> >>>>> yes exactly, eventfd fd is blocking, so it actually makes no sense to >>>>> use IOSQE_ASYNC >>>> >>>> Right, and it's pollable too. >>>> >>>>> I just tested eventfd without the IOSQE_ASYNC flag, it seems to work >>>>> in my tests, thanks a lot :) >>>>> >>>>>> In any case, it should of course work. This is the leftover trace when >>>>>> we should be exiting, but an io-wq worker is still trying to get data >>>>>> from the eventfd: >>>>> >>>>> interesting, btw what kind of tool do you use for kernel debugging? >>>> >>>> Just poking at it and thinking about it, no hidden magic I'm afraid... >>> >>> Josef, can you try with this added? Looks bigger than it is, most of it >>> is just moving one function below another. >> >> Hmm, which kernel revision are you poking? Seems it doesn't match >> io_uring-5.10, and for 5.11 io_uring_cancel_files() is never called with >> NULL files. >> >> if (!files) >> __io_uring_cancel_task_requests(ctx, task); >> else >> io_uring_cancel_files(ctx, task, files); > > Yeah, I think I messed up. If files == NULL, then the task is going away. > So we should cancel all requests that match 'task', not just ones that > match task && files. > > Not sure I have much more time to look into this before next week, but > something like that. > > The problem case is the async worker being queued, long before the task > is killed and the contexts go away. But from exit_files(), we're only > concerned with canceling if we have inflight. Doesn't look right to me. In theory all that should be killed in io_ring_ctx_wait_and_kill(), of course that's if the ring itself is closed. Guys, do you share rings between processes? Explicitly like sending io_uring fd over a socket, or implicitly e.g. sharing fd tables (threads), or cloning with copying fd tables (and so taking a ref to a ring). In other words, if you kill all your io_uring applications, does it go back to normal? -- Pavel Begunkov