On 4/7/20 1:30 PM, Jens Axboe wrote:
> On 4/7/20 9:38 AM, Oleg Nesterov wrote:
>> On 04/07, Oleg Nesterov wrote:
>>>
>>> On 04/07, Jens Axboe wrote:
>>>>
>>>> --- a/fs/io_uring.c
>>>> +++ b/fs/io_uring.c
>>>> @@ -7293,10 +7293,15 @@ static void io_ring_ctx_wait_and_kill(struct io_ring_ctx *ctx)
>>>>  		io_wq_cancel_all(ctx->io_wq);
>>>>
>>>>  	io_iopoll_reap_events(ctx);
>>>> +	idr_for_each(&ctx->personality_idr, io_remove_personalities, ctx);
>>>> +
>>>> +	if (current->task_works != &task_work_exited)
>>>> +		task_work_run();
>>>
>>> this is still wrong, please see the email I sent a minute ago.
>>
>> Let me try to explain in case it was not clear. Let's forget about io_uring.
>>
>> 	void bad_work_func(struct callback_head *cb)
>> 	{
>> 		task_work_run();
>> 	}
>>
>> 	...
>>
>> 	init_task_work(&my_work, bad_work_func);
>>
>> 	task_work_add(task, &my_work);
>>
>> If the "task" above is exiting the kernel will crash; because the 2nd
>> task_work_run() called by bad_work_func() will install work_exited, then
>> we return to task_work_run() which was called by exit_task_work(), it will
>> notice ->task_works != NULL, restart the main loop, and execute
>> work_exited->fn == NULL.
>>
>> Again, if we want to allow task_work_run() in do_exit() paths we need
>> something like below. But still do not understand why do we need this :/
>
> The crash I sent was from the exit path, I don't think we need to run
> the task_work for that case, as the ordering should imply that we either
> queue the work with the task (if not exiting), and it'll get run just fine,
> or we queue it with another task. For both those cases, no need to run
> the local task work.
>
> io_uring exit removes the pending poll requests, but what if (for non
> exit invocation), we get poll requests completing before they are torn
> down. Now we have task_work queued up that won't get run, because we
> are in the task_work handler for the __fput(). For this case, we
> need to run the task work.
>
> But I can't tell them apart easily, hence I don't know when it's safe
> to run it. That's what I'm trying to solve by exposing task_work_exited
> so I can check for that specifically. Not really a great solution as
> it doesn't tell me which of the cases I'm in, but at least it tells me
> if it's safe to run the task work?

It's also possible I totally mis-analyzed it, and it really is back to
"just" being an ordering issue that I then work around by re-running the
task_work within the handler.

-- 
Jens Axboe
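
For reference, the re-entrancy problem Oleg describes above can be modelled
in userspace. This is only a sketch, not kernel code: struct task,
model_task_work_add(), model_task_work_run(), run_scenario() and the
null_func_calls counter are hypothetical names, plain assignments stand in
for cmpxchg(), the list is not reversed, and an "exiting" flag stands in for
PF_EXITING. Instead of jumping through the NULL pointer, the model counts
how often it would have done so.

```c
#include <stddef.h>

struct callback_head {
	struct callback_head *next;
	void (*func)(struct callback_head *);
};

/* Hypothetical stand-in for the task_struct fields that matter here. */
struct task {
	struct callback_head *task_works;
	int exiting;			/* models PF_EXITING */
};

static struct callback_head work_exited;	/* sentinel, func == NULL */
static struct task cur;				/* models "current" */
static int null_func_calls;			/* would-be NULL calls */

static int model_task_work_add(struct task *t, struct callback_head *w)
{
	if (t->task_works == &work_exited)
		return -1;		/* task already ran its exit work */
	w->next = t->task_works;
	t->task_works = w;
	return 0;
}

static void model_task_work_run(struct task *t)
{
	for (;;) {
		struct callback_head *work = t->task_works;

		if (!work) {
			/* Empty list: an exiting task installs the sentinel. */
			if (t->exiting)
				t->task_works = &work_exited;
			break;
		}
		t->task_works = NULL;	/* take the whole list */

		while (work) {
			struct callback_head *next = work->next;

			if (!work->func)
				null_func_calls++;	/* the kernel would crash */
			else
				work->func(work);
			work = next;
		}
	}
}

/* The problematic callback: it re-enters the run loop. */
static void bad_work_func(struct callback_head *cb)
{
	(void)cb;
	model_task_work_run(&cur);
}

/* Queue bad_work_func, then run the list as exit_task_work() would. */
int run_scenario(int exiting)
{
	struct callback_head my_work = { NULL, bad_work_func };

	cur.task_works = NULL;
	cur.exiting = 0;
	null_func_calls = 0;

	model_task_work_add(&cur, &my_work);
	cur.exiting = exiting;
	model_task_work_run(&cur);
	return null_func_calls;
}
```

In this model run_scenario(0) returns 0 and run_scenario(1) returns 1: only
when the task is exiting does the nested call install work_exited, which the
outer loop then picks up as if it were ordinary queued work.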