On 4/8/20 1:25 PM, Jens Axboe wrote:
> On 4/8/20 1:17 PM, Oleg Nesterov wrote:
>> On 04/08, Jens Axboe wrote:
>>>
>>> Here's some more data. I added a WARN_ON_ONCE() for task->flags &
>>> PF_EXITING on task_work_add() success, and it triggers with the
>>> following backtrace:
>> ...
>>> which means that we've successfully added the task_work while the
>>> process is exiting.
>>
>> but this is fine, task_work_add(task) can succeed if task->flags & EXITING.
>>
>> task_work_add(task, work) should only fail if this "task" has already passed
>> exit_task_work(). Because if this task has already passed exit_task_work(),
>> nothing else can flush this work and call work->func().
>
> So the question remains, we basically have this:
>
> A                       B
> task_work_run(tsk)
>                         task_work_add(tsk, io_poll_task_func())
> process cbs
> wait_for_completion()
>
> with the last wait needing to flush the work added on the B side, since
> that isn't part of the initial list.
>
> I don't I can fully close that race _without_ re-running task work
> there. Could do something ala:
>
> A                       B
> mark context "dead"
> task_work_run(tsk)
>                         if (context dead)
>                             task_work_add(helper, io_poll_task_func())
>                         else
>                             task_work_add(tsk, io_poll_task_func())
> process cbs
> wait_for_completion()
>
> which would do the trick, but I still need to flush work after having
> marked the context dead.

Actually, I guess it's not enough to re-run the work, we could also have
ordering issues if we have io_poll_task_func() after the fput of the
ring. Maybe this could all work just fine if we just make the ring exit
non-blocking...

Testing.

-- 
Jens Axboe