On Fri, Feb 21, 2020 at 06:49:16AM -0800, Jens Axboe wrote:
> > Jens, what exactly is the benefit of running this on every random
> > schedule() vs in io_cqring_wait() ? Or even, since io_cqring_wait() is
> > the very last thing the syscall does, task_work.
>
> I took a step back and I think we can just use the task work, which
> makes this a lot less complicated in terms of locking and schedule
> state. Ran some quick testing with the below and it works for me.
>
> I'm going to re-spin based on this and just dump the sched_work
> addition.

Awesome, simpler is better.

> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 81aa3959f326..413ac86d7882 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -3529,7 +3529,7 @@ static int __io_async_wake(struct io_kiocb *req, struct io_poll_iocb *poll,
>  	 * the exit check will ultimately cancel these work items. Hence we
>  	 * don't need to check here and handle it specifically.
>  	 */
> -	sched_work_add(tsk, &req->sched_work);
> +	task_work_add(tsk, &req->sched_work, true);
>  	wake_up_process(tsk);
>  	return 1;
>  }
> @@ -5367,9 +5367,9 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
>  	do {
>  		if (io_cqring_events(ctx, false) >= min_events)
>  			return 0;
> -		if (!current->sched_work)
> +		if (!current->task_works)
>  			break;
> -		sched_work_run();
> +		task_work_run();
>  	} while (1);
>
>  	if (sig) {
> @@ -5392,6 +5392,12 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
>  						TASK_INTERRUPTIBLE);
>  		if (io_should_wake(&iowq, false))
>  			break;
> +		if (current->task_works) {
> +			task_work_run();
> +			if (io_should_wake(&iowq, false))
> +				break;
> +			continue;
> +		}

Doesn't:

	if (current->task_works)
		task_work_run();
	if (io_should_wake(&iowq, false))
		break;

work?

>  		schedule();
>  		if (signal_pending(current)) {
>  			ret = -EINTR;

Anyway, we need to be careful about the context where we call
task_work_run(), but afaict doing it here should be fine.